Preface


This volume is a collection of expanded papers selected from the 2013 International 
Symposium on Aviation Psychology (ISAP) that was held May 6-9 at Wright 
State University in Dayton, Ohio. 

The first ISAP was held in recognition of the unique and difficult challenges 
posed by the aviation environment to the field of applied psychology. Dr Richard 
Jensen convened the “First Symposium on Aviation Psychology” at Ohio State 
University in Columbus, Ohio, in April of 1981. In the foreword to the proceedings
of that first symposium, the goals were clearly laid out: “The objective of this
symposium was to critically examine the impact of high technology on the role, 
responsibility, authority, and performance of human operators in modern aircraft 
and air traffic control systems.” This was a very ambitious objective for a small 
conference held in America’s heartland. 

Nevertheless, the objective was met and the “First Symposium on Aviation 
Psychology” was a resounding success! There were 210 attendees for this first 
gathering and the Proceedings of the Symposium contained 43 papers and 
abstracts. Considered and debated were many of the central challenges of aviation 
such as cockpit display and control design, automation, selection, workload, and 
performance assessment. The meeting was also successful in attracting participants 
from the varied communities that have a stake in aviation psychology. There 
were attendees and presenters from academia, military, government regulatory 
agencies, and industry (including airframe manufacturers and airlines). And even 
in this first iteration of the meeting there was a healthy presence of international 
representation from such nations as Canada, Austria, Saudi Arabia, Germany, 
Great Britain, and Israel. 

A clear outcome of the First Symposium was the recognition that many 
challenges would remain and require diligent research in the future. It was also 
decided that a regular symposium on aviation psychology would be a very 
beneficial venue to encourage focus on aviation psychology’s evolving challenges 
and a forum to consolidate the findings and to sharpen the questions and debates 
central to the advance of a safe aviation environment. 

Consequently, a symposium has been held biennially since 1981. In 2003 the 
symposium was hosted in Dayton in conjunction with celebrations of the 100th 
Anniversary of the Wright Brothers’ first flight. Beginning that year, conference 
logistics have been managed through Wright State University. However, Dick 
Jensen continued to serve as the Symposium Chair until 2007. Following his 
retirement, it was decided that the symposium should continue. Since 2009, the 
symposium has been managed through a collaboration between the Department 
of Psychology at Wright State University and the Air Force Research Laboratory 


Advances in Aviation Psychology 


(AFRL) at Wright-Patterson Air Force Base. The continued success of the 
symposium would not have been possible without the support from the Air Force
Office of Scientific Research for the 2009 and 2011 meetings. The present volume
is a direct outgrowth of the seventeenth ISAP held at Wright State University in
2013.

The 32-year span separating the first and seventeenth symposia has
witnessed both the enduring challenges and the rapidly changing technological
advances confronting aviation psychology, as well as evolving theoretical and
methodological psychological paradigms for meeting these challenges (see Chapter
1). Over the years, the conference has continued to focus on the objectives outlined 
for the original meeting in 1981—to critically examine the role of humans in the 
context of changing technologies and operational contexts. And the conference 
has continued to attract broad participation that spans research and operational 
communities, and that includes a strong international contingent. 

The present volume highlights the inherently intricate involvement of human 
interaction with a vast and complex aviation system in order to accomplish a mission 
that the human is ill-equipped to accomplish without significant technological 
support. For example, care must be taken that the demands placed on any individual 
or team do not exceed their capabilities. Consequently, in aviation psychology,
interface design is a major concern to ensure that the information needed by the
human operator(s) is presented in understandable formats at an efficient rate of
transmission. Importantly, the synergy of the human capabilities (some innate and
many acquired via training) and the information provided via the human’s senses 
and the system’s displays must provide an understanding that can support effective 
decision making and control. In order to validate the success of interface designs 
and training regimens, aviation psychology has had to develop assessment tools 
to measure mental workload and situation awareness in relation to the impacts 
on operational effectiveness. To optimally support the human, the system must at
times utilize automation to take action without direct human control. However, 
this must be carefully managed in order to not disrupt the human’s understanding 
of what is happening. It has become clear that advances and improvements in 
automation change, but do not diminish, the importance of the role of humans in 
aviation systems. 

In this vein, our Keynote Speaker, Dr Nancy Leveson, presented an
excellent overview of a systems perspective on aviation psychology, illustrating
how the human’s presence in a system contributes to or defends against errors and
their consequences. Following this, another Keynote Speaker, Dr James Lackner,
provided an intriguing description of his discovery of how humans’ terrestrially
evolved senses are subject to illusions when exposed to the unique demands of
aerospace flight. A third Keynote Speaker, Dr Max Mulder, explored some of the 
implications of advanced automation and increasing mission complexity for the 
design of flight deck displays. All three speakers have contributed chapters based 
on their keynote addresses. 

The remaining chapters were selected from among the technical papers
presented during the meeting. They reflect both the emerging and enduring
challenges facing aviation psychology today. The chapter topics span flight
deck and air traffic control developments in preparation for NextGen operations; 
pilot factors, especially human aircrews interacting with each other or with 
automation; and exciting new approaches toward increasing the understanding of 
and training for modern aviation operations. 

We are especially proud to include two chapters whose lead authors were among 
students who competed in the Stanley N. Roscoe Best Student Paper Competition 
for the seventeenth ISAP. Kathleen Van Benthem co-authored the chapter on
“Individual Pilot Factors Predict Simulated Runway Incursion Outcomes” and Jan 
Comans co-authored the chapter on “Risk Perception in Ecological Information 
Systems.” We congratulate Jan for winning the Best Student Paper Award. 

Despite the dramatic changes in the technologies present within the aviation 
system, many of the challenges confronted by the chapters in this volume were 
foreshadowed within the 1981 proceedings. This is not surprising because 
operational effectiveness and safety still depend on coordination between 
technologies and humans. Developing the human-machine synergy is the enduring 
challenge of aviation psychology and the chapters of the current volume are 
excellent examples of some of the best contemporary approaches for addressing 
that challenge. 


Chapter 1

Aviation Psychology: Optimizing Human 
and System Performance 

Michael A. Vidulich
Air Force Research Laboratory, USA

Pamela S. Tsang & John M. Flach
Wright State University, USA


Aviation Psychology: Then and Now 

This chapter provides a quick historical review of the emergence of aviation 
psychology as a discipline and its important role not only in aviation applications, 
but also in the larger domains of scientific and applied psychology. This will lead 
into a discussion of some of the enduring and emerging challenges confronted 
within aviation psychology that are featured in subsequent chapters. 

Although by no means the only arena of engagement for human factors or 
applied psychology, aviation psychology can legitimately be considered one of 
the cradles and nurseries for those fields. This is not surprising because the field 
of aviation was one of the most visible inroads of modern technology into modern 
life. This visibility reflects the fact that flight was the realization of a seemingly 
impossible dream that stretched both human and technical capabilities to their 
limits. 

Even before the aircraft was officially born, there was considerable variability
in terms of assumptions about the role of the human pilot. For example, Lilienthal
essentially bet his life on the athletic capabilities of humans to solve the
control problem, with catastrophic results. In contrast, Langley was so focused
on engineering the aeronautical platform (that is, wings and engine) that he 
underestimated the need to support the piloting task with appropriate controls and 
training. The result was two unsuccessful launches of his aerodrome in 1903 that 
led to great pessimism about whether the dream of flight would ever be realized. 

It could be argued that the success of the Wright brothers later that same year
was due to their methodical approach to all of the challenges associated with
heavier-than-air flight. In particular, they understood that an effective human-machine
interface (that is, the control system) and appropriate training (that is, extensive 
experience piloting kites and gliders) were essential complements to innovations
with respect to the development of engines and wings. With continuing advances
in technology and changes in concepts of air operations the debate about the role 
of humans in aviation systems continues to this day. 

An early challenge to the viability of commercial aviation was
maintaining stability in low-visibility conditions (for example, in
clouds or at night) (Previc & Ercoline, 2004). It took the pioneering work of 
Ocker and Crane (1932) to convince the aviation community that this reflected 
fundamental limits of the human ability to sense orientation. Again, success 
required a blend of engineering—supplemental sensors and displays (for example, 
an artificial horizon), and psychology—training (that is, so that the pilots learned 
to trust the instruments). The joint capability of the human with the technology 
(that is, the Sperry artificial horizon) was demonstrated in the pioneering flights of 
Jimmy Doolittle in 1929 and Albert Hegenberger in 1932. However, this was only
the start of a long progression toward the development of appropriate instruments 
and training for effective instrument flight. 

In the ensuing years the continued challenges of commercial and military 
aviation led to innovations in aeronautical systems, information systems, and 
in our understanding of human capabilities. Many credit the work of the
Cambridge Research Lab in the UK (led by Kenneth Craik and Frederic Bartlett)
and the Aeromedical Research Lab in the US (led by Paul Fitts; see Fitts, 1947)
with setting the stage for the modern fields of engineering psychology and aviation
psychology. A major theme of this work was that the information processing 
limitations of the human components could be modeled in ways that could inform 
engineering decisions about the technological systems and, in particular, about the
interfaces between human and technology. 

The challenges of the aviation domain were not simply about “piloting.” 
It became increasingly obvious that the aviation system included a distributed 
network of air and ground components. One of the earliest examples of the design 
of a situation display for supporting team coordination and situation awareness can 
be seen in the British Royal Air Force Operations Rooms from the Battle of Britain 
days. The development of radar is widely acknowledged to be a breakthrough 
technology that enabled a leap in air defense effectiveness. As World War II 
approached, the British had been building a coordinated air defense as a result 
of their World War I experience (Rawlinson, 1924). It was a system of telephone 
communications from human observers using binoculars and sound detectors being 
brought together in an “operations room” where the opponent’s movements were 
displayed on a large table-top map display. The commander observing this display 
could then use another set of telephone lines to issue orders to flak batteries (some 
mobile) and fighter aircraft bases to defend specified areas and routes. The British 
Royal Air Force used these earlier experiences and added a network of radar and 
telephone systems to the overall air defense system. Large table-top maps with 
movable physical icons were updated to reflect input from the radar displays. Air 
control officers observing these situation displays and defensive squadron status 
displays on the walls from a balcony would attempt to identify the most efficient 
way to deploy the limited fighter aircraft to counter them (Hough & Richards,
1989). In other words, the key to the successful air defense was the creation of a
display that presented the crucial information to the decision makers and a means to
convey those decisions to the appropriate forces.

Following the war, the need to integrate air and ground operations continued 
to grow to enable groups of aircraft to accomplish their various missions in a 
growing expanse of space and expectation of speed. From the earliest air traffic 
controllers waving flags to instruct pilots when to land and take off and having
only bonfires to signify air highways, to rotating light beacons and radar, to the 
present day satellite-based technology, air traffic control (ATC) has developed 
into an elaborate socio-technical system requiring collaboration among a widely 
distributed collection of people and technologies. 

As an offshoot of aviation, the space race leveraged aviation psychology to 
support operations beyond the atmosphere. One of the eerie echoes from the 
early days of aviation was the sometimes acrimonious debate regarding the role 
(if any) of the human astronaut in controlling the spacecraft (Mindell, 2008). 
Many engineers felt that only automated systems could be trusted to deal with 
the unknown conditions of vehicle control in space while the pilot and astronaut 
communities argued for more prominent roles for the human in the astronaut- 
spacecraft team. For example, Charles Donlan, NASA’s Associate Director 
of Project Mercury, and Jack Heberlig from the Office of the Project Manager, 
commented on the importance of the astronaut’s role in the Mercury program: 
“Of utmost practical importance to future manned space flight efforts will be 
the quality of the astronaut’s performance in space. Accuracy of manual attitude 
control should provide a good test of psychomotor performance capability. 
Monitoring of the capsule orbital systems should provide an indication of the 
vigilance and perceptual accuracy. Navigation will test reasoning and visual 
discrimination of earth terrain features and heavenly bodies outside the capsule” 
(Donlan & Heberlig, 1961, p. 37). The wisdom of providing for human astronaut 
control was verified when on the first US orbital flight, John Glenn’s Mercury 
spacecraft developed a malfunctioning thruster and he was able to use the manual 
control to compensate (Mindell, 2008). 

Debates regarding the best role allocation between humans and machines 
have not abated to this day (for example, Barnes & Jentsch, 2010; Billings, 1991;
Parasuraman & Byrne, 2003; Tsang & Vidulich, 1989) and the issues have come to 
the forefront of aviation psychology with new challenges posed by the control of 
unmanned aerial vehicles (UAVs) and the implementation of the Federal Aviation 
Administration (FAA) Next Generation Air Transportation System (NextGen). 

In the area of UAVs, once again, the hopes for the technological side of the 
human-machine team are very high, as expressed by Cosenzo, Parasuraman, 
and de Visser (2010, p. 103): “The prevailing expectation in the robotics 
community is that autonomy will enable robots (air, ground, or sea) to function 
with little or no human intervention.” So far, these hopes are not being realized. 
To optimally support the human, the machine must at times take action without 
direct human control. However, this must be carefully managed in order to not
disrupt the human’s situation awareness or increase the human’s mental workload. 
Considerable human involvement in controlling and monitoring UAV operations 
is expected to continue not only because of the technological challenges, but also 
because of the legal and moral issues associated with deploying weapons systems 
(see, for example, Bowden, 2013). 

In the area of NextGen, wide-ranging changes are proposed and planned 
to prepare the US for the anticipated growing air travel demand through 2025. 
Sheridan (2009) outlined some of the significant changes in equipment and 
roles and responsibilities of pilots and air traffic controllers that include the use of
ADS-B technology (automatic dependent surveillance-broadcast) that would
provide more accurate latitude and longitude surveillance than radar; four-dimensional
trajectories to be negotiated between airline operations personnel,
pilots, controllers and airport managers well before flight time; digital datalink 
as the major means of air-ground communication; and pilots assuming primary 
responsibility for self-separation while controllers assume more flow management 
responsibilities. As with any large-scale system change, human errors
and system failures will need to be anticipated not just for normal operations but 
also, and especially, for off-normal operations (Wickens, 2009; Wickens, Hooey, 
Gore, Sebok & Koenicke, 2009). 

The next section presents a few parallel developments in psychology that 
either have a significant impact on aviation or that are at least partly a result of a 
need to address problems confronted in aviation. 


The Aviation and Psychology Symbiosis 

As noted above, the Wright Brothers’ success was due not only to their ingenuity 
in the mechanical engineering of aerodynamics but also to their recognition of 
the need for human control. Thus, developing or training piloting skill was an 
essential component of the Wright Brothers’ research program (for example, they 
made the analogy to learning to ride a horse or a bike). Orville Wright began 
training students in 1910 and established several training sites shortly after. The 
consequence of the lack of training was painfully clear when the US Army Air
Corps was asked to take over the mail delivery service after an air mail scandal
in 1934. In addition to their pilots having limited experience with winter flying, 
night-time flying, and instrument flying, most of the planes were poorly equipped. 
In 78 days of operation, 66 accidents resulted in 12 fatalities. Mail service was 
returned to the airline industry within a few short months. But training alone was 
not always sufficient. The disastrous results of training without selection during 
World War I revealed the need not just for selection for physical fitness but also 
selection for personality, emotional stability, and cognitive abilities (Armstrong, 
1939; see Carretta & Ree, 2003 for a more current review). 

The challenges of aviation that psychologists wrestled with during World War
II further spurred the synthesis of aviation and psychology. The recognition that
behaviorism, the then-dominant psychological approach, was ill-equipped
to provide practical solutions to the many military problems led to the
development of a number of new approaches to psychology.

For example, James Gibson, working in the US Army Air Forces Aviation
Psychology Program, developed visual aptitude tests for screening pilot applicants.
He also explored the possibility of increasing the effectiveness of training films in 
the 1940s (Gibson, 1947). Inspired in part by Langewiesche’s (1944) descriptions 
of how pilots use structure in optical flow fields to make judgments about approach 
and landing, Gibson rejected classical approaches to space perception to consider 
the possibility of “direct perception” for the control of locomotion. Many of these 
insights were first presented to psychology in the books The Perception of the 
Visual World in 1950 and The Ecological Approach to Visual Perception in 1979. 
Gibson postulated that properties of objects and surfaces in the world are all 
perceived directly and are not inferred from sensations or mediated by cognitive 
processes. 

Broadbent (1958) also recognized the need for more basic knowledge of 
how information is processed and developed a very different approach. He was
instrumental in introducing the tools of communication science and information
theory as a framework for quantifying human capacity (for example, bandwidth)
in terms that were compatible with the metrics used for characterizing the
performance of automated systems (for example, bits/s). Building on computer
and communication system metaphors, human cognition was modeled as a system 
of information processing stages. 
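
As a rough illustration of what such a common metric looks like, the information conveyed by a choice among N equally likely alternatives, and the transmission rate when each response takes T seconds, can be written as below. The symbols H, N, R, and T and the numbers in the example are assumptions chosen for illustration; they do not come from Broadbent (1958).

```latex
H = \log_2 N \ \text{bits}, \qquad R = \frac{H}{T} \ \text{bits/s}

\text{Example: } N = 8,\ T = 0.5\ \text{s} \;\Rightarrow\;
H = \log_2 8 = 3\ \text{bits}, \qquad R = \frac{3}{0.5} = 6\ \text{bits/s}
```

Expressed this way, a human operator and a machine channel can be compared on the same scale.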

One of the most widely known psychological facts is the magical number
seven, which refers to the limited amount of information one can hold in
short-term memory. More than half a century later, there is little to refute the
hard limit of short-term memory to a few chunks of information. But often
underappreciated is the significance of the concept of chunking that George Miller
discussed in his 1956 paper (Baddeley, 1994). As hard as the capacity limit of 
short-term memory is, the size of a chunk can be variable and is limited only by 
one’s knowledge base and strategy of forming meaningful units of information, 
enabling an effectively variable-capacity short-term memory. This was a significant 
departure from the behavioristic view of the dominance of environmental factors 
in determining behavior and fostered the idea that information processing does not 
necessarily begin and end in the environment. Today, efforts to study and model
strategic processing abound in areas such as signal detection (Sperandio, 1978),
visual scanning (Bellenkes, Wickens & Kramer, 1997; van de Merwe, van Dijk
& Zon, 2012), decision making (Schriver, Morrow, Wickens & Talleur, 2008),
resource allocation (Iani & Wickens, 2007), management of mental workload
(Adams, Tenney & Pew, 1995), and acquisition of expertise (for example, Adams
& Ericsson, 1992). 

The strategic and adaptive nature of human control did not escape those
modeling the pilot-aircraft system. During the 1950s, the US Air Force sponsored
a major effort to determine the transfer function relating the pilot’s control stick
output to the error input (discrepancy in pitch, roll, or yaw relative to a given
reference setting on the attitude indicator) in order to be better able to predict the 
overall system performance (Sheridan, 2010). McRuer and Jex (1967) discovered
that, whatever the controlled element dynamics, the human operator would adapt
and adjust his own transfer function in order to achieve low error and a high degree
of system stability. The ability of pilots to adapt to substantial variations in flight 
dynamics is well characterized in the study of adaptive manual control (Kelley, 
1968; Wickens, 2003; Young, 1969). 
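
This adaptation is commonly summarized by what is known as the crossover model. The sketch below uses notation that does not appear in the text above and should be read as an assumption: Y_p is the pilot's describing function, Y_c the controlled element, \omega_c the crossover frequency, and \tau_e an effective time delay.

```latex
Y_p(j\omega)\, Y_c(j\omega) \;\approx\; \frac{\omega_c \, e^{-j\omega \tau_e}}{j\omega},
\qquad \omega \text{ near } \omega_c
```

Read this way, the pilot supplies whatever equalization the controlled element lacks. If Y_c behaves like a pure gain, the pilot acts roughly like an integrator. If Y_c behaves like an integrator, the pilot acts roughly like a gain. If Y_c behaves like a double integrator, the pilot must generate lead.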

In addition to capturing the human adaptability, the optimal control model 
was developed to more explicitly take into account the overall system constraints 
and certain known human limitations (such as perceptual and motor noise and 
processing delay) while allowing the operator to strategically meet operation 
criteria (such as maximizing fuel efficiency versus optimizing time of arrival). 
The optimal control model was developed to model not just flight control but a 
wide variety of tasks including team performance and flight management that have 
assumed an increasingly more significant role since the burgeoning of automated 
systems in the cockpit. One application is the PROCRU (procedure-oriented crew) 
model that has individual models for each crew member (pilot flying, pilot not 
flying, and second officer) and covers a range of activities that include monitoring, 
continuous and discrete control, situation assessment, decision making, and 
communication (Baron, Muralidharan, Lancraft & Zacharias, 1980). 

But what Baron (1988) considered to be the ultimate human capacity is the 
ability to predict and to behave in an open-loop manner. Engineering indeed has 
solved many problems associated with manual control and advanced technologies 
now provide support for augmenting the human senses (for example, night- 
vision goggles and head-up displays (HUDs) that enable precise aircraft position 
information and out-the-window information to be available simultaneously). 
Contrast, for example, the “simplicity” of control of the 1913 Wright Flyer (Figure
1.1) and the present day Boeing 787 Dreamliner (Figure 1.2). The higher-level 
executive functioning of the human, such as that of anticipating and predicting, 
however, has stubbornly defied accurate modeling to the point where automation 
could replace the human completely except for the most predictable operations. 
In fact, increasing capabilities often lead to increasing mission complexity and an 
increased probability of unanticipated variability. This unanticipated variability can 
require creative problem solving beyond the capacity of “rule-based” automated 
systems. Though slow in comparison to computer processing, the human is still a 
valuable resource for creatively adapting to unanticipated situations. 

Recognizing the unique ability of human experts to deal with “unanticipated 
variability,” there has been increased attention to the human-machine interface as 
a “representation” supporting creative problem solving. Constructs like “Direct
Manipulation” (for example, Shneiderman, 1992; Hutchins, Hollan & Norman,

Figure 1.1 Orville Wright standing in river with Wright Model CH Flyer, 1913

Source: Courtesy of Special Collections and Archives, Wright State University. 



Figure 1.2 Boeing 787 Dreamliner flight deck 

Source: Photo by Dan Winters © 

1986) and “Ecological Interface Design” (for example, Bennett & Flach, 2011)
reflect attention to the role of the interface for representing the “deep structure” of 
work domains in ways that facilitate productive thinking about difficult problems. 
As pilots are confronted with increasingly complex cockpits, the need for careful 
interface design will increase as well. 

Herbert Simon (1979), a pioneer in cognitive science, advocated the use of 
computer simulation to study human cognitive processes at the level of neural 
processes, at the level of elementary information processes (for example, visual 
scanning, memory retrieval), and at the level of higher mental processes (for 
example, decision making and problem solving). Simon and Newell first developed
the computational theories of the human cognitive architecture (Newell, Shaw &
Simon, 1958; see Gluck, 2010, for a review of the currently studied architectures).

Although Simon focused on the higher mental processes as opposed to the 
neural level, development at the neural level was by no means neglected in the 
field. For example, in 1978 McDonnell Douglas Astronautics Company organized 
a conference under the sponsorship of the Defense Advanced Research Projects 
Agency (DARPA) entitled Biocybernetic Applications for Military Systems 
(Gomer, 1980). At the conference, a wide variety of psychophysiological measures
(such as eye-tracking and pupillometrics, steady-state electroencephalogram
(EEG) and evoked potentials, electro-dermal responses) were reviewed for their 
potential applicability to military environments, especially military aviation. An 
expressed interest of the conference was the utility of these measures to provide, 
on a moment-to-moment real-time basis, information about the human that could 
be used to enhance system performance in some way. One goal was to go beyond
the traditional manner of closing the human-machine loop, which uses displays to
inform the human of the environment (including information from or about the
machine) and controls for the human to effect the machine’s operation. This
way of closing the loop leaves a very important path unexploited:
the machine has little information about the state of the human. It was the hope
that the physiological indices could potentially inform the machine concerning 
the state of the human such that function allocation between the human and the 
machine may be implemented dynamically. 

Some three decades later, technology has afforded much improved means of 
studying the cognitive processes at the neural level. See for example the many 
neurophysiological and brain imaging techniques currently used to understand 
and assess brain functioning that are discussed in Kramer and Parasuraman 
(2007) and Parasuraman (2011). The value of knowing the operator state has 
been demonstrated in a number of applications such as adaptive automation in the 
cockpit (Parasuraman & Byrne, 2003), in ATC (Wilson & Russell, 2003), and in
UAV control (Christensen & Estepp, 2013; Parasuraman, Cosenzo & De Visser,
2009; Wilson & Russell, 2007). With adaptive automation, some behavioral or 
physiological indices are used to invoke different levels of automation according 
to some pre-established algorithm. For example, when workload level (indicated
by the operator state) or performance reaches an unacceptable level, additional
automated aiding is provided to ensure satisfactory system performance. When
workload level or performance returns to an acceptable level, the automated aiding is
reduced or withdrawn in order to keep the human operator active in the loop
and abreast of the situation as much as possible.
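
The logic just described can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not an implementation taken from any of the cited studies: the normalized workload index, the two thresholds, the number of automation levels, and the names `AdaptiveAutomation` and `run_trace` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveAutomation:
    """Toy controller that adjusts the automation level from a workload index."""

    high_threshold: float = 0.8  # hypothetical: workload above this is unacceptable
    low_threshold: float = 0.5   # hypothetical: workload below this is acceptable
    max_level: int = 3           # hypothetical number of aiding levels above manual
    level: int = 0               # 0 means fully manual control

    def update(self, workload: float) -> int:
        """Return the automation level to use after one new workload sample."""
        if workload > self.high_threshold and self.level < self.max_level:
            self.level += 1      # workload unacceptable: add automated aiding
        elif workload < self.low_threshold and self.level > 0:
            self.level -= 1      # workload acceptable again: withdraw aiding
        return self.level


def run_trace(samples):
    """Feed a sequence of workload samples through a fresh controller."""
    controller = AdaptiveAutomation()
    return [controller.update(w) for w in samples]


if __name__ == "__main__":
    print(run_trace([0.4, 0.85, 0.9, 0.7, 0.45, 0.3]))
```

The gap between the two thresholds is a design choice in this sketch: it keeps the automation level from flipping back and forth when the workload index hovers near a single cutoff.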

One emerging development goes a step further. Recent research suggests 
that more direct augmenting of the human’s capabilities might be within reach. 
Parasuraman and Galster (2013) describe a “Sense-Assess-Augment” framework 
to address the questions of when and how to provide the augmentation. The 
individual or team cognitive or functional state will first need to be sensed via 
behavioral or neural measures. These measures will then be assessed relative to 
certain performance requirements. Augmentation will be provided if needed to 
optimize mission effectiveness. 

The general steps of this method involve first using neurophysiological studies 
(such as functional Magnetic Resonance Imaging (fMRI) and event-related
potential (ERP) studies) to identify the key brain regions associated with the 
cognitive function of interest. For example, when data quality is poor in a visually 
degraded condition, action understanding and threat detection could be improved 
with some top-down attention guidance. Noninvasive brain stimulation could then 
be applied to cortical regions critical for action understanding, threat detection, 
and attention control in order to accelerate learning and enhance performance (see 
Falcone, Coffman, Clark & Parasuraman, 2012). 

Two brain stimulation methods that have received some empirical scrutiny and 
have shown some potential for improving overall human-machine aviation system 
effectiveness are transcranial magnetic stimulation (TMS) and transcranial direct 
current stimulation (tDCS). See Fox (2011) and McKinley, Bridges, Walters, and 
Nelson (2012) for reviews of their current developments. 

The present overview has covered a century or so of developments in aviation 
operations and their scientific basis in psychology from before the first powered 
flight in 1903 to the centennial of the first commercial flight in 2014. There is no 
question that great advances have been made in both domains and that neither could 
have gone nearly as far by itself as each has by borrowing from and influencing 
the other’s theoretical, empirical, and practical discoveries. Much has changed 
but there are principles and sage advice that endure. Gopher and Kimchi (1989) 
persuasively argued that the most efficacious way to meet the challenges of rapidly 
changing technological developments in today’s and in tomorrow’s environment 
is not to react to every momentary novelty but to focus on developing theoretical 
models and principles that could be generally applied. Simon (1990, p. 2) reminded 
psychologists to keep in mind the many models of different sciences that have 
proven to be productive and offered the following as a way forward. 

Psychology does not much resemble classical mechanics, nor should it aim to do 
so. Its laws are, and will be, limited in range and generality and will be mainly 
qualitative. Its invariants are and will be of the kinds that are appropriate to 
adaptive systems. Its success must be measured not by how closely it resembles 
physics but by how well it describes and explains human behavior. 

Consistent with Simon’s advice and as discussed above, quantification of human 
performance has many advantages and is especially useful for model and theory 
building and implementing engineering solutions. But it is not the only tool 
available. Well-validated qualitative accounts of human behavior can be equally 
powerful, if only in a different way. Advances in aviation psychology will 
no doubt use both approaches in the future, just as they have in the past. 

Hazard analysis is at the heart of system safety. It can be described succinctly as 
“investigating an accident before it happens.” A hazard is selected, such as two 
aircraft violating minimum separation standards or an aircraft losing sufficient lift 
to maintain altitude, and then the scenarios that can lead to that hazardous state are 
identified. Hazards are informally defined here as precursor states to accidents that 
the designer never wants the system to get into purposely. The resulting scenarios 
or potential paths to the hazard are then used to compute the probability of the 
hazardous state occurring or to design to either eliminate the scenarios or to control 
or mitigate them. Alternatively, after an accident, hazard analysis techniques can 
generate the potential scenarios to assist accident investigators in determining the 
most likely cause. 

Most of the current widely used hazard analysis methods were created 50 or 
more years ago when the systems being built were simpler and were composed 
primarily of electro-mechanical components. Human operators mostly followed 
pre-defined procedures consisting of discrete and cognitively simple tasks such 
as reading a gauge or opening a valve. Failure rates and failure modes could be 
determined through historical usage or through extensive testing and simulation. 
Humans were either omitted from these calculations or were assumed to “fail” in 
the same way that electro-mechanical components did, that is, randomly and with 
an identifiable probability. Safety engineers and human factors experts existed 
in separate worlds: the safety engineers concentrated on the hazardous scenarios 
involving the physical engineered components of the system and human factors 
experts focused on the human operator such as training and the design of the 
physical interface between the human and the engineered system. 

As software was introduced to increase functionality and desired system 
properties (such as efficiency and fuel savings), the role of the operator changed 
from one of direct controller to supervisor of the automation that actually flew the 
plane. The increasing complexity led to new types of human error (Sarter & Woods, 
2008) and stretched the limits of comprehensibility for both the designers and the 
operators of these systems. We are now designing systems in which operator error 
is inevitable, but still blame most accidents on the pilots or operators. Something 
is then either done about the operator involved, such as firing or retraining them, 
or engineers do something about operators in general, such as marginalizing them 
further by automating more control functions or rigidifying their work by creating 
more rules and procedures, many of which cannot be followed if the system is to 
operate efficiently (Dekker, 2006). 

At the same time, the hazard analysis methods were not updated to take into 
account the new types of accident scenarios that were occurring and to treat the 
operator as an integral part of the larger system. As a result, hazard analyses often 
miss possible scenarios, especially those involving software or humans. To make 
progress, we need the psychology, human factors, and engineering communities 
to come together to create more powerful hazard analysis methods—and therefore 
ways to improve the system design—that are appropriate for the systems being 
built and operated today. This chapter describes a potential approach to doing that. 
It starts from an extended model of accident causality called STAMP (System- 
Theoretic Accident Model and Processes) that better describes the role humans 
and software play in accidents today (Leveson, 2012). 

In the next section, STAMP and an associated new hazard analysis method 
called System-Theoretic Process Analysis (STPA) are described along with the 
resulting implications for more sophisticated handling of humans in engineering 
analysis and design. Proposed changes to ATC (NextGen) are used as an example. 
Then open questions are described in which the aviation psychology community 
could provide important contributions. 


How Are Accidents Caused? 

Traditional safety engineering techniques are based on a very old model of accident 
causation that assumes accidents are caused by directly related chains of failure 
events: failure A leads to failure B which causes failure C, which leads to the loss. 
For example, the pitot tubes freeze, which causes the computer autopilot to stop 
operating (or to operate incorrectly), followed by a stall warning that is incorrectly 
handled by the pilots, which leads to the plane descending into the Atlantic. This 
chain of events is an example of an accident scenario that might be generated by a 
hazard analysis. The underlying model of causality implies that the way to prevent 
accidents is to prevent these individual failure events, for example, train pilots 
better in how to react to a stall warning and improve the pitot tube design. 

The chain-of-events model served well for simpler systems, but our more 
complex, software-intensive systems are changing the nature of causality in 
accidents. Software does not fail randomly and, in fact, one could argue that it 
does not fail at all. Software is an example of pure design without any physical 
realization. How can an abstraction fail? It certainly can do the wrong thing at 
the wrong time, but almost always accidents related to software are caused by 
incorrect requirements, that is, the software engineers did not understand what the 
software was supposed to do under all conditions, such as when false readings are 
provided by the pitot tubes. In the same way, human contributions to accidents are 
also changing, with the rise in importance of system design factors, such as mode 
confusion, that cannot be explained totally by factors within the human but instead 
result from interactions between human psychology and system design. Many 
accidents today are not caused by individual component failure but by unsafe and 
unintended interactions among the system components, including the operators. 

The STAMP model of accident causality was created to deal with the new 
factors in accidents and to consider more than individual or multiple component 
failure in causal analysis (Leveson, 2012). Accidents are treated not as a chain of 
component failure events but as the result of inadequate enforcement of constraints 
on the behavior of the system components. In this case, the system includes the 
entire socio-technical system. Figure 2.1 shows an example of a typical hierarchical 
safety control structure in aviation. Each component in the structure plays a role 
in accident prevention and, therefore, in accident causation. The control structure 
on the left ensures that safety is built into the system (for example, aircraft) and 
the control structure on the right ensures that the systems are operated safely. 
There are usually interactions among them. Each of the components in Figure 
2.1 has a set of responsibilities or safety constraints that must be enforced by that 
component to prevent a hazard. 

Figure 2.2 shows the hierarchical control structure (omitting the upper levels 
for simplicity) involved in a new ATC procedure called In-Trail Procedure (ITP) 
that allows aircraft to pass each other over the Atlantic airspace even though 
minimum separation requirements may be violated temporarily during the 
maneuver. Information about the location of both aircraft is provided through 
Global Positioning System (GPS) and ADS-B and the ITP equipment onboard the 
aircraft determines whether passing will be safe at this point. If the ITP criteria for 
safe passing are met, the pilots can request a clearance to execute the maneuver. 
A hazard analysis of this system would attempt to generate the scenarios in which 
ITP could lead to an accident. That information can then be used by engineers and 
human factors experts to try to prevent accidents either through system design 
changes or operational procedures. 

Figure 2.2 The safety control structure for ITP 

An important component of STAMP is the concept of a process model (see 
Figure 2.3). The safety control structure is made up of feedback control loops 
where the controller issues commands or control actions to the controlled process, 
for example, the pilot sends a command to the flight computer to ascend. In order 
to operate effectively, every controller must have a model of what it thinks is the 
state of the subsystem it is controlling. The actions or commands that the controller 
issues will be based at least partly on that model of the state of the system. If this 
model is incorrect, that is, inconsistent with the real state of the system, then the 
controller may do the “wrong” thing in the sense that it is the right thing with respect 
to the information the controller has but wrong with respect to the true state of 
the system. If the pilots or the ATC controller has an incorrect understanding of 
whether the criteria for safe execution of the ITP are met, for example, they may 
do the wrong thing even though they have not themselves “failed” but simply were 
misled about the state of the system. 
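
The point that a controller can act correctly on its process model and still act unsafely can be illustrated with a small sketch. Everything named here (the `CrewProcessModel` class, the two boolean beliefs, and the action strings) is a hypothetical simplification for illustration and is not drawn from the ITP specification.

```python
from dataclasses import dataclass


@dataclass
class CrewProcessModel:
    """What the flight crew believes about the ITP situation (assumed fields)."""

    criteria_met: bool = False
    clearance_given: bool = False


def choose_action(model: CrewProcessModel) -> str:
    # The decision logic is sound with respect to the crew's beliefs.
    if model.criteria_met and model.clearance_given:
        return "execute ITP"
    return "hold"


def is_hazardous(action: str, true_criteria_met: bool, true_clearance_given: bool) -> bool:
    # Safety is judged against the true state, not against the beliefs.
    return action == "execute ITP" and not (true_criteria_met and true_clearance_given)


# True state: clearance was given, but the criteria are NOT met.
true_criteria_met, true_clearance_given = False, True

# Incorrect or stale feedback leaves the process model inconsistent with reality.
misled = CrewProcessModel(criteria_met=True, clearance_given=True)
misled_action = choose_action(misled)
misled_hazard = is_hazardous(misled_action, true_criteria_met, true_clearance_given)

# Accurate feedback brings the process model back in line with the true state.
informed = CrewProcessModel(criteria_met=False, clearance_given=True)
informed_action = choose_action(informed)
informed_hazard = is_hazardous(informed_action, true_criteria_met, true_clearance_given)

print(misled_action, misled_hazard)      # execute ITP True
print(informed_action, informed_hazard)  # hold False
```

The same `choose_action` function produces both outcomes; only the process model differs, which is why the analysis looks at feedback and information flow rather than at "failure" of the controller.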



Figure 2.3 Every controller contains a model of the state of the 
controlled process 

The process model is kept up to date by feedback and other inputs. In humans, 
the process model is usually considered to be part of the mental model. Note that 
the feedback channels are crucial, both in terms of their design and operation. 

While this model works well for software and is certainly a better model for 
how humans work than that of random failure, it can be improved with respect to 
accounting for human factors in accidents. Some ideas for achieving this goal are 
presented later. But first, the implications of the present model are considered and 
the results of using it in hazard analysis are compared with those of traditional 
hazard analysis methods. 

There are four types of unsafe control actions that can lead to an accident. 

1. A command required for safety (to avoid a hazard) is not given. For example, 
two aircraft are on a collision course and neither the Traffic Collision Avoidance 
System (TCAS) nor an ATC Controller issues an advisory to change course. 

2. Unsafe commands are given that cause a hazard. An example is an ATC 
Controller issuing advisories that put two aircraft on a collision course. 

3. Potentially correct and safe commands are given, but at the wrong time (too 
early, too late, or in the wrong sequence). For example, TCAS provides a 
resolution advisory for the pilot to pull up too late to avoid a collision. 

4. A required control command is stopped too soon or continued too long. For 
example, the pilot ascends as directed by a TCAS resolution advisory but 
does not level off at the required altitude. 

Although classic control theory and control commands are emphasized here, 
the model is more general in terms of accounting for other types of controls on 
behavior than just a physical or human controller in a feedback loop. For example, 
component failures and unsafe interactions may be controlled through design 
using standard engineering techniques such as redundancy, interlocks, or fail-safe 
design. System behavior may also be controlled through manufacturing processes 
and procedures, maintenance processes, and operations. A third and important 
type of control over behavior comes through social controls, which may be 
governmental or regulatory but may also be cultural values, insurance, the legal 
system, or even individual self-interest. The goal of design for safety is to create 
a set of socio-technical safety controls that are effective in enforcing the behavior 
required for safety while at the same time allowing as much freedom as possible 
in how the non-safety goals of the system are achieved. 


Identifying Hazardous Scenarios 

STPA is a new hazard analysis method based on the STAMP accident causation 
model. It works as a top-down system engineering process that starts with system 
hazards and then identifies behavioral constraints that must be imposed on the 
system components in order to ensure safety. It also assists safety analysts and 
system designers in identifying the set of scenarios that can lead to an accident. 
In practice, STPA has been found to identify a larger set of scenarios than found 
by traditional hazard analysis techniques, such as fault trees, event trees, and 
failure modes and effects analysis, particularly with respect to those scenarios 
involving software or human behavior (for example, Balgos, 2012; Fleming, 
Spencer, Thomas, Leveson & Wilkinson, 2013; Ishimatsu et al., 2014; Pereira, 
Lee & Howard, 2006). 

To understand how STPA works, consider the ITP example (RTCA, 2008). 
The STPA process first identifies the types of unsafe control actions that can lead 
to particular hazards and then uses that information and the control structure to 
identify the causes or scenarios that could lead to the unsafe control action. In the 
previous section, four general types of unsafe control action were identified. These 
are listed across the top of Table 2.1. The flight crew has two types of control actions 
they can provide (column 1): an action to execute the ITP and an action to abort 
it if they believe that is necessary. Within the table, the types of hazardous control 
actions are listed, for example, executing the ITP when the ATC Controller has not 
approved it or executing it when the criteria for safe passing are not satisfied. The 
actual process (along with automated support) used to create the table is beyond the 
scope of this chapter but the reader should be able to see easily how this could be 
accomplished. A complete ITP analysis can be found in Fleming et al. (2013). 

Table 2.1 Potentially unsafe control actions by the flight crew 

Controller:   Not Providing        Providing Causes       Wrong Timing/Order   Stopped Too Soon/
Flight Crew   Causes Hazard        Hazard                 Causes Hazard        Applied Too Long
------------  -------------------  ---------------------  -------------------  -------------------
Execute ITP                        ITP executed when      ITP executed too     ITP aircraft levels
                                   not approved.          soon before          off above
                                                          approval.            requested FL.

                                   ITP executed when      ITP executed too     ITP aircraft levels
                                   criteria are not       late after           off below
                                   satisfied.             reassessment.        requested FL.

                                   ITP executed with
                                   incorrect climb rate,
                                   final altitude, etc.

Abnormal      Flight crew          Flight crew aborts
Termination   continues with       unnecessarily.
of ITP        maneuver in
              dangerous            Flight crew does not
              situation.           follow regional
                                   contingency
                                   procedures while
                                   aborting.
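
The layout of Table 2.1 (control actions down the side, the four general types of unsafe control action across the top) can be generated mechanically, leaving the analyst to fill in each cell. The following is a minimal sketch of that bookkeeping only; the names `UnsafeControlType`, `CONTROL_ACTIONS`, and `blank_worksheet` are hypothetical, and this is not the automated support mentioned in the text.

```python
from enum import Enum
from itertools import product


class UnsafeControlType(Enum):
    """The four column headings of Table 2.1."""

    NOT_PROVIDED = "Not Providing Causes Hazard"
    PROVIDED = "Providing Causes Hazard"
    WRONG_TIMING_OR_ORDER = "Wrong Timing/Order Causes Hazard"
    STOPPED_TOO_SOON_OR_APPLIED_TOO_LONG = "Stopped Too Soon/Applied Too Long"


# The two flight crew control actions named in the text.
CONTROL_ACTIONS = ("Execute ITP", "Abnormal Termination of ITP")


def blank_worksheet(control_actions):
    # One empty cell for every (control action, unsafe control type) pair.
    return {
        (action, uca_type): []
        for action, uca_type in product(control_actions, UnsafeControlType)
    }


worksheet = blank_worksheet(CONTROL_ACTIONS)

# Record one entry taken from the table; the other cells are left for the analyst.
worksheet[("Execute ITP", UnsafeControlType.PROVIDED)].append(
    "ITP executed when not approved."
)

open_cells = [cell for cell, entries in worksheet.items() if not entries]
print(len(worksheet), len(open_cells))  # 8 7
```

A cell may legitimately stay empty after review, as several do in Table 2.1; the value of the grid is that every combination is considered explicitly.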




Once the unsafe control actions have been identified, their potential causes 
are identified using the generic types of failures or errors that could occur in the 
control loop as shown in Figure 2.4. The information about the potential causes 
can then be used for system design to eliminate or reduce them, create operational 
procedures, design training, and so on. For example, consider the reasons for why 
the flight crew might execute the ITP maneuver when it has not been approved 
or when the criteria are not satisfied. There are numerous such reasons, but many 
are related to the flight crew’s mental model, that is, they think that the approval 
has been given (when it has not) or they think the criteria are satisfied when they 
are not. Some scenarios involve the crew receiving incorrect information, different 
sources giving conflicting information, the crew misperceiving what information they 
have received, and so on. These scenarios (reasons) are used to design protection 
against the unsafe behavior by the flight crew and to create detailed requirements 
for the design of the system. 


[Figure 2.4: diagram of a general control loop. Recoverable labels: 
Controller: control input or external information wrong or missing; control 
algorithm (flaws in creation, process changes, incorrect modification or 
adaptation); process model (inconsistent, incomplete, or incorrect); missing or 
wrong communication with another component. 
Control path: provided control action (inappropriate, ineffective, or missing); 
inadequate operation; received control action (delayed, etc.). 
Sensor and feedback path: inadequate operation; provided feedback (incorrect, no 
information provided, measurement inaccuracies, delays); received feedback 
(inadequate, missing, or delayed). 
Controlled process: component failures; changes over time; Controller 2; 
conflicting control actions; process input missing or wrong; unidentified or 
out-of-range disturbance; process output contributes to system hazard.] 

Figure 2.4 Generic types of problems in a general control loop that could 
lead to unsafe control 

An important question, of course, is whether STPA is better than the traditional 
hazard analysis methods that are being used for NextGen. The official hazard 
analysis for ITP uses a combination of fault tree and event trees (RTCA, 2008). 
Probabilities are assigned to human error through a subjective process that involved 
workshops with controllers and pilots, eliciting how often they thought they 
would make certain types of mistakes. 

As an example, one of the fault trees depicts the scenario for executing the ITP 
even though the ITP criteria are not satisfied. The fault tree analysis starts with an 
assigned probabilistic safety objective of 1.63e-3 per ITP operation at the top of 
the tree. Three causes are identified for the unsafe behavior, which is approving an 
ITP maneuver when the distance criterion is not satisfied: (1) the flight crew does 
not understand what the ITP minimum distance is; (2) ATC does not receive the 
ITP distance but approves the maneuver anyway; or (3) there are communication 
errors (partial corruption of the message during transport). Probabilities are 
assigned to these three causes and combined to get a probability (1.010e-4) for the 
top event, which is within the safety objective. 
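
To make the arithmetic of such a comparison concrete, the sketch below combines three cause probabilities and checks the result against the safety objective. It rests on assumptions that should be read as such: the text does not say how the three probabilities were combined, so an OR gate over independent causes is assumed here, and the three cause probabilities are invented placeholders rather than the values used in the RTCA analysis. Only the safety objective of 1.63e-3 comes from the text.

```python
from math import prod


def or_gate(probabilities):
    """P(at least one cause occurs), ASSUMING the causes are independent."""
    return 1.0 - prod(1.0 - p for p in probabilities)


SAFETY_OBJECTIVE = 1.63e-3  # per ITP operation, as stated in the text

# Hypothetical placeholder values, not taken from the official analysis.
cause_probabilities = {
    "crew misunderstands ITP minimum distance": 5e-5,
    "ATC approves without receiving ITP distance": 3e-5,
    "message corrupted during transport": 2e-5,
}

top_event = or_gate(cause_probabilities.values())
within_objective = top_event <= SAFETY_OBJECTIVE

print(f"{top_event:.3e}", within_objective)
```

The independence assumption built into `or_gate` is what allows the causes to be combined this simply, so the result is only as trustworthy as that assumption and the elicited input values.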

The goal in the official risk assessment is to determine whether the maneuver 
will be within the assigned safety objective and not to improve the design. The 
fault tree analysis gives no guidance on how to prevent the human errors but 
instead assumes they happen arbitrarily or randomly. The fault tree also assumes 
independent behavior, but the interaction and behavior of the flight crew and 
ATC may be coupled, with the parties exerting influence on each other or being 
influenced by higher-level system constraints. Finally, the analysis asserts that 
communication errors are due to corruption of data during transport (essentially a 
hardware or software error), but there are many other reasons for potential errors 
in communication. 

The STPA results include the basic communication errors identified in the 
fault tree, but STPA also identifies additional reasons for communication errors 
as well as guidance for understanding human error within the context of the 
system. Communication errors may result from confusion about multiple sources 
of information (for either the flight crew or ATC), from confusion about heritage 
or newly implemented communication protocols, or from simple transcription or 
speaking errors. There is no way to quantify or verify the probabilities of any 
of these sources of error for many reasons, particularly because the errors are 
dependent on context and the operator environments are highly dynamic. Instead 
of assuming that humans will rarely “fail,” the STPA analysis assumes they will 
make mistakes and specifies safety and design requirements accordingly. 


Possible Extensions to System-Theoretic Process Analysis (STPA) for 
Human Factors 

While STPA as defined above is proving in many comparative studies to be 
better than traditional hazard analysis techniques, it needs to be improved. The 
first step would be to provide a less naive model of the human controller. While 
humans do not fail like mechanical components, they also do not operate with 
fixed algorithms (procedures) like computers as assumed above. Figure 2.5 shows 
a more realistic model of the role of humans in STAMP. 

There are three levels of control shown in Figure 2.5. The bottom two, that 
is, the controlled process and an automated controller, are the same as shown 
previously. The top level is a first attempt at a more sophisticated model of the 
behavior of a human controller. Rather than having a fixed control algorithm (or 
procedure) that is always strictly followed, humans generate control actions using 
a model of the controlled process, a model of the automated controller, a model 
of the context in which the control is taking place, as well as written or trained 
procedures. 

Leveson (2012) has identified some basic design principles using this model 
to reduce human controller errors, for example, ways to support the controller in 
creating and maintaining an accurate mental model of the controlled process and 
of the automation. Known problems, such as mode confusion, are included. While 
these design principles are not unknown in aviation psychology, they are restated 
in a way that engineers can apply them directly to their designs. These principles 
could and should be expanded. 

Another important improvement would be to extend the STPA process 
to include more fundamental human factors concepts. The resulting analysis 
could have important potential implications for providing engineers with the 
information necessary to design systems that greatly reduce the types of human 
error contributing to accidents. 



Figure 2.5 An extension to include a more realistic model of human behavior 


Conclusions 

Engineering needs to get beyond greatly oversimplifying the role of humans in 
complex systems, but the aviation psychology community will need to help. 
Hazard analysis and system design techniques that were created 50 years ago are 
no longer useful enough. This chapter has described a new, expanded model of 
accident causation, STAMP, based on systems thinking that could be the start 
for engineers and human factors experts to work together to create much safer 
systems. 


Chapter 3 

An Earthbound Perspective on Orientation 
Illusions Experienced in Aerospace Flight 

James R. Lackner 

Brandeis University, USA 


My colleagues and I study human movement control, orientation, and adaptation in 
unusual force environments. Being able to work in these conditions is, besides being 
scientifically rewarding, enormous fun. Our work has been supported for many 
years by the Air Force Office of Scientific Research. Several years ago, Dr Willard 
Larkin, our program officer, after thinking about the wide range of observations 
on human performance we had made and were still making, suggested that we develop 
a model that provided a unifying explanatory basis for them. This challenge led us 
to develop a new model of orientation in which the vestibular system is calibrated 
and its output interpreted in relation to the 1 g acceleration of Earth gravity. 
The model predicts the results of earlier studies of exposure to non-1 g acceleration 
levels and the illusory changes in visual and self-orientation associated with these 
exposures. It also makes unique predictions that were verified experimentally. 
This chapter describes the background experiments and scientific journey with its 
twists and turns that led to this new model and viewpoint. 

The semicircular canals and otolith organs of the vertebrate vestibular system 
are often likened to an inertial navigation system (Mayne, 1974). Such systems 
typically have integrating triaxial angular and linear accelerometers. An important 
feature of inertial navigation systems is that intermittently their calibration needs to 
be updated and this is done in relation to Earth, using Earth’s gravity to determine 
whether the platform remains horizontal (Feynman, Gottlieb & Leighton, 2013). 
In the vertebrate inner ear, the three orthogonally-oriented semicircular canals 
on either side of the head are activated by angular but not linear acceleration. 
Their output is proportional to angular velocity, and that output when integrated 
provides an estimate of angular displacement. The two otolith organs on either 
side, the utricle and saccule, respond to linear acceleration and their output is 
dependent on the resultant of the gravitational and inertial acceleration acting on 
them, the gravitoinertial acceleration (GIA). The output of the vestibular system 
must be periodically calibrated as well to ensure accurate interpretation of body 
orientation. 

A wide range of illusions, including the G-excess, the oculogravic, and 
the “giant hand,” have been related to “excess” or unusual otolith stimulation 
while the oculogyral, audiogyral, and somatogyral illusions have been related to 
semicircular canal activity (Benson, 1999; Clark & Graybiel, 1949a, b; Ercoline, 
DeVilbiss, Yauch & Brown, 2000; Gillingham 1992; Gillingham & Previc, 1996; 
Graybiel & Hupp, 1946; Graybiel, 1952; Graybiel & Brown, 1951; Clark & 
Graybiel, 1966; Miller & Graybiel, 1966). These illusions actually all involve 
conjoint errors in the representation of body orientation and in the localization 
of sensory stimuli (DiZio, Held, Lackner, Shinn-Cunningham & Durlach, 2001; 
Lackner & DiZio, 2010). For example, during the catapult launch of an aircraft 
from a carrier the resultant gravitoinertial force on the pilot will be tilted forward 
and this gives rise to the postural and visual illusion that the aircraft nose is pitched 
upward. 
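
The geometry behind the catapult example can be sketched numerically: the gravitoinertial acceleration (GIA) is the resultant of gravity and the inertial acceleration, so a sustained forward acceleration rotates it away from the gravitational vertical. The function below is an illustrative calculation only. The name `gia` and the example input are hypothetical, the computation covers only the magnitude and angle of the resultant relative to gravity, and no claim is made that the perceived pitch equals this angle.

```python
import math

G = 9.80665  # standard gravity in m/s^2


def gia(forward_accel: float):
    """Return (GIA magnitude in g units, angle of the GIA from gravity in degrees)."""
    magnitude = math.hypot(G, forward_accel)
    tilt_deg = math.degrees(math.atan2(forward_accel, G))
    return magnitude / G, tilt_deg


# Hypothetical example: a sustained forward acceleration equal to 1 g.
magnitude_g, tilt_deg = gia(G)
print(round(magnitude_g, 3), round(tilt_deg, 1))  # 1.414 45.0

# With no forward acceleration the GIA coincides with gravity.
print(gia(0.0))  # (1.0, 0.0)
```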

Our work initially was not directed toward understanding how the vestibular 
system is calibrated and how this relates to illusions evoked by unusual patterns of 
vestibular stimulation. Instead, it had to do with a more mundane problem: why do 
astronauts experience space motion sickness (SMS) in the weightless conditions 
of space flight? To study this issue we carried out experiments in parabolic flight 
where periods of high force (~1.8 g) and free fall (0 g) alternated, as shown in 
Figure 3.1. The important point is that “0 g” in this context means being in a state of 
free fall: both the test subjects and the aircraft are falling together. Consequently, 
in the 0 g phase of flight the aircraft exerts no forces on the bodies of the test 
subjects. The reason we normally experience body weight is because on Earth 
we are supported by a surface of some sort except momentarily during running or 
jumping when both feet are off the ground. The support surface exerts a reaction 
force on our body and the magnitude of that force—under stationary conditions— 
reflects our body weight. D’Alembert’s reformulation of Newton’s Second Law as 

$$F_g + F_e - ma = 0$$

is the best way to understand the physiological effects of weightlessness, where 
$F_g$ is the force of gravity, $F_e$ is the sum of all external forces on the body, and $ma$ 
is the inertial force. Being weightless corresponds to a situation in which $F_e = 0$ so 
that $F_g = ma$, and “pulling Gs” occurs when $F_e/F_g > 1$ so that $F_g + F_e = ma > 1\,mg$. 
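
As a rough numerical illustration of this force balance, the sketch below treats the external force as a single vertical support force and takes upward as positive, so that gravity contributes -mg. Those simplifications, the function names, and the 80 kg example mass are assumptions made here for illustration. The g-load is taken as the magnitude ratio of the external force to body weight.

```python
G = 9.80665  # standard gravity in m/s^2


def g_load(mass_kg: float, support_force_n: float) -> float:
    """Ratio of the external (support) force to body weight, in g units."""
    return support_force_n / (mass_kg * G)


def net_acceleration(mass_kg: float, support_force_n: float) -> float:
    """Solve F_g + F_e = m a for a, with up positive and F_g = -m g."""
    return (-mass_kg * G + support_force_n) / mass_kg


mass = 80.0  # hypothetical body mass in kg

free_fall = (g_load(mass, 0.0), net_acceleration(mass, 0.0))
standing = (g_load(mass, mass * G), net_acceleration(mass, mass * G))
pull_up = (g_load(mass, 1.8 * mass * G), net_acceleration(mass, 1.8 * mass * G))

print(free_fall)  # zero g-load: the body accelerates downward at G
print(standing)   # g-load of 1: no net acceleration
print(pull_up)    # g-load of about 1.8, as in the high-force phase of parabolic flight
```

In free fall the support force vanishes and the body accelerates with gravity alone, which is the weightless case; a g-load above 1 corresponds to pulling Gs.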

It had been known since the earliest space missions that head movements 
in weightless conditions are provocative (Titov & Caidin, 1962). With longer 
duration missions, SMS became a significant concern, affecting nearly 70 percent 
of all astronauts and cosmonauts during their first several days in-flight (Graybiel 
et al., 1975a, Graybiel, Miller & Homick, 1975b, 1976; Lackner & DiZio, 2006). 
In military aviation, airsickness is commonly evoked by head movements when 
pulling Gs. The provocativeness of head movements in aerospace conditions 
presented us with a quandary, however, as we tried to identify etiological factors 
in SMS. The quandary related to the results of the most systematic study of motion 
sickness ever conducted in space flight, the Skylab M-131 experiment in the three 
manned Skylab missions (Graybiel et al., 1975b, 1977). That experiment involved 
having seated, blindfolded astronauts rotating at constant velocity make repeated 
head movements out of the axis of rotation. Such head movements produce an 
unusual pattern of activation of the semicircular canals (activating all three on 
each side of the head) known as Coriolis cross-coupling stimulation. On Earth, 
such cross-coupling stimulation is very provocative. Rotating at 25 rpm, most 
individuals can only make a few head movements without experiencing symptoms 
of motion sickness. 

Figure 3.1 Flight profile of an aircraft flying parabolic maneuvers to 
generate alternating periods of high force and weightlessness 

During pre-flight baseline assessments all nine astronauts who took part in the 
Skylab M-131 experiment had been highly susceptible to Coriolis cross-coupling 
stimulation. They had to terminate testing prematurely because of experiencing 
nausea prior to making the scheduled 150 head movements in the test protocols. 
Thus, all were sensitive to cross-coupling stimulation of the semicircular canals. 
However, when tested in space flight, all were insusceptible to motion sickness. They 
could make the maximum number of head movements in the test protocol, even at 
higher velocities of rotation than during the laboratory tests, without experiencing 
any symptoms of motion sickness. This finding was totally unexpected. Figure 3.2 
shows the performance of the three astronauts on the fourth Skylab mission. 

The Skylab M-131 experiment had been expected to be so provocative that 
participating in it might incapacitate an astronaut for an entire day. Consequently, 
the in-flight testing was delayed until on or after Mission Day 6. In the first 
few days of their missions, the astronauts had found that simply making head 
movements per se could elicit symptoms; later these same head movements were 
non-provocative. When tested on and after Mission Day 6, they were insusceptible 
to motion sickness when making head movements during rotation, the same head 
movements that had been so provocative during rotation on Earth. 

Figure 3.2 The height of the columns in the figure shows the magnitude of motion sickness experienced by the three 
Skylab 4 astronauts pre-flight, in-flight, and post-flight during exposure to Coriolis cross-coupling stimulation. 
The velocity of rotation in r/min (RPM), number of head movements, direction of rotation clockwise or 
counterclockwise, and test day (- = pre-flight, + = in-flight, R = post-flight) are shown at the bottom 

Figure 3.3 illustrates what happens to the semicircular canals when a pitch head 
movement is made during rotation. During acceleration to constant velocity, the 
horizontal semicircular canals will initially be activated and then their neural discharge 
levels will decay back to baseline. At time B, the subject is rotating at constant velocity 
and will feel stationary. At time C, the subject makes a pitch back head movement. 
This takes the horizontal semicircular canals out of the plane of rotation and they lose 
angular momentum and signal prolonged yaw motion of the head; the “roll canals” 
are brought into the plane of rotation receiving an angular momentum impulse and 
signal prolonged roll of the head; only the “pitch canals” accurately signal the motion 
of the head. This complex pattern of stimulation is very nauseogenic and disorienting. 
Astronauts refer to it as “tumbling the gyros.” Importantly, the Skylab astronauts all 
reported that in-flight the head movements no longer tumbled their gyros even with 
head movements made at higher velocities of rotation than on Earth. 

Figure 3.3 Illustration of the pattern of activation of the semicircular 
canals when a subject is accelerated to constant velocity and 
after a time lapse makes a pitch back head movement 

One consequence of being weightless is the absence of hydrostatic pressure 
in the circulatory system, which results in a substantial rostral redistribution of 
blood and lymph in the body, and gives rise to the puffy faces associated with 
being in space flight. Such fluid shifts were thought to be a potential cause of 
SMS by altering pressures in the vestibular system and creating a situation akin 
to labyrinthine hydrops. This sensitization was thought to be part of the reason 
head movements in weightlessness were initially provocative in Skylab missions. 
Importantly, fluid shifts also occur on Earth whenever we change our orientation 
in relation to gravity. With the body horizontal, hydrostatic pressure is minimized 
because the effective height of the fluid column is the diameter of the long arteries 
and veins. We decided to evaluate the fluid shift hypothesis by positioning subjects 
in an apparatus that allowed us to rotate them “barbecue spit fashion,” but with a 
more humane method of attachment, about their long body axis while they were 
horizontal or tilted 10 degrees head up or down with respect to the horizontal. 
Rotation at constant velocity generated a continuous sweeping stimulation of the 
otolith organs by the force of gravity and varying the tilt angle significantly varied 
the magnitude of fluid shift. 

We rotated subjects at 30 rpm and tested them blindfolded in straight-and-level 
flight and in the weightless and high force periods of parabolic flight. To our 
surprise, in 1g all subjects experienced not barbecue spit rotation but rather an 
orbital motion while always facing in the same direction, either up or down. One orbit 
was traversed each time they rotated 360 degrees (Lackner & Graybiel, 1978a). 
Figure 3.4 shows the relationship between actual body rotation and apparent 
orbital position for face-up and face-down experienced orbits. The direction of 
travel in the orbit was always opposite to that of the direction of actual rotation. 

The 180 degree shift in body position for face-up versus face-down orbital 
configurations made us wonder whether touch and pressure cues on the body from 
the apparatus were determining experienced patterns. We proved this to be the 
case by having subjects voluntarily press against the apparatus in different ways 
to change the touch and pressure forces on their body. By doing so, they could 
drastically alter their apparent orientation. Putting pressure on the top of their head 
could make them feel upside down as they “went” through their orbital motion 
(Lackner & Graybiel, 1978b). Increasing the pressure on their head would make 
the diameter of their orbital path larger and increase their apparent orbital velocity. 
This was also reflected in the eye movements elicited by their body rotation. 
Each time orbital velocity increased nystagmus slow phase eye velocity increased 
although the stimulation of the otolith organs remained the same. 

When the aircraft went from 1g straight-and-level flight to 2g, subjects 
experienced a doubling or more of their orbit. Their experienced orbit could be 
larger than the circumference of the fuselage of the Boeing KC-135 aircraft in 
which they were being tested. Simultaneously, an enormous increase in orbital 
velocity was experienced because one orbit was still “completed” for each 
revolution of the apparatus. As the aircraft went into the weightless phase of flight 
subjects experienced a remarkable change. As the g-level decreased, their apparent 
orbit shrank until in 0g they felt perfectly stationary although they were rotating at 
30 rpm. Moreover, although consciously aware of their orientation and location in 
the aircraft, most lost their sense of “spatial anchoring” to their environment. They 
were consciously aware of the actual location and configuration of their body in 
relation to the aircraft but did not have a sense of being in a particular location 
relative to it. They felt non-oriented and non-located spatially, not disoriented 
(Lackner & Graybiel, 1979). 

Figure 3.4 The upper row illustrates the physical orientation of a 
blindfolded test subject being rotated at 30 rpm clockwise 
about the recumbent Z-axis. The bottom row illustrates the 
subject’s perceived counterclockwise orbit in relation to actual 
physical orientation 

These findings pointed to a profoundly important influence of touch and 
pressure cues on perceived body orientation. During Z-axis recumbent rotation 
in non-weightless conditions, these stimuli rather than the otolith signals were 
determining perceived orientation. We later found that typically when a person is 
free floating in weightlessness, and closes his or her eyes, all spatial sense of being 
in a specific orientation in relation to the vehicle is lost (Lackner, 1992, 1993). 
However, contact with the fuselage—even light touch of a fingertip—brings 
back a sense of spatial orientation to the vehicle (Lackner & DiZio, 1993). Thus, 
we found that in weightless conditions, where there are no shear forces on the 
otolith organs, the otoliths do not convey a sense of head orientation in relation to the 
vehicle, but mechanical contact cues can provide one. 

Our experiments, however, provided no support for the notion that the absence 
of hydrostatic pressure in the circulatory system during orbital flight contributes 
to SMS (Graybiel & Lackner, 1979). Varying fluid shift magnitudes did not affect 
motion sickness susceptibility. Since we knew that initially head movements were 
provocative in space flight, we next decided to look at how different types of head 
movements—yaw, pitch, and roll—affected susceptibility to motion sickness in 
the different force phases of parabolic flight. Exposure to alterations in force level 
per se can be provocative so we first tested subjects while they were seated with 
head and body fixed stably in position to determine their baseline susceptibility. 
We then in subsequent flights had them make head movements in the weightless or 
high force periods of the parabolas. The results were unequivocal: head movements 
in a high force background were much more provocative than head movements in 
weightless conditions, but head movements made in 0g elicited more sickness 
than the baseline parabolas not involving head movements (Lackner & Graybiel, 
1984b, 1985, 1986a, 1987). Moreover, across force levels, pitch head movements 
were most provocative, roll movements less so, and yaw movements still less so. 
These findings corresponded with the space flight findings that head movements 
can induce symptoms in a majority of astronauts and cosmonauts during their 
initial flight days until adaptation takes place. However, the findings cast no light 
on why head movements during rotation would not be provocative in space flight. 

To attack the issue directly, we had subjects make off-axis head movements 
during constant velocity body rotation in parabolic flight to determine how 
virtually immediate changes in force level affected provocativeness. The results 
were stunning: executing head movements during rotation in weightlessness was 
no more provocative than making the same head movements when not rotating. 
Head movements in 2g were intensely provocative and disorienting, much more 
so than in 1g straight-and-level flight. We also had subjects rotating at constant 
velocity make head movements during force transitions and found that the effect of 
g-level on provocativeness was immediate. Head movements during the transition 
from 1g to 0g were less provocative than those made in the transition from 0g to 
1g. They were also less disorienting (Lackner & Graybiel, 1984b, 1986a). 

But the enigma remained: why did the provocativeness of head movements during 
rotation vary as a function of g-level when the activation of the semicircular 
canals should be independent of background force level? Two findings provided 
essential clues. One was that the path of the head during a pitch head tilt during 
rotation seemed to be deviated laterally relative to the torso in a scalloping fashion 
that could not be predicted from the stimulation of the semicircular canals (Lackner 
& DiZio, 1992). This made us wonder whether the Coriolis force acting on the 
head as it moved off the axis of rotation was a significant factor. The other clue 
was that in 0g the sense of simultaneous motion about multiple axes following a 
head movement was absent. 

To pursue these issues we first conducted studies in our slow-rotation-room 
(SRR) with subjects seated at the center of rotation. In this circumstance, head 
movements and arm movements made during rotation generate transient Coriolis 
forces but minuscule centrifugal forces (Lackner & DiZio, 1994, 1997, 1998). 
Coriolis forces ($F_c$) are a function of the velocity of rotation ($\omega$), the mass ($m$) of 
the moving object, and its velocity ($v$) relative to the rotating reference frame: 
$F_c = -2m(\omega \times v)$. The velocity profile of voluntary body movements is typically bell 
shaped; consequently, the Coriolis force generated will also be bell shaped and act 
on the moving head or arm in the direction opposite that of rotation. 
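
As a rough numerical illustration of this relationship (not taken from the cited experiments), the following Python sketch evaluates the Coriolis force over a bell-shaped movement. The 1 kg effective mass, the 0.5 s movement duration, the 1 m/s peak speed, the raised-cosine shape of the speed profile, and all function names are assumptions chosen only for illustration; the 10 rpm counterclockwise rotation matches the rotating-room condition described in this chapter.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def coriolis_force(mass, omega, velocity):
    """F_c = -2 m (omega x v), with v measured in the rotating frame."""
    c = cross(omega, velocity)
    return tuple(-2.0 * mass * ci for ci in c)

def bell_speed(t, duration, peak):
    """Raised-cosine speed profile: zero at start and end, peak at mid-movement."""
    if t < 0.0 or t > duration:
        return 0.0
    return 0.5 * peak * (1.0 - math.cos(2.0 * math.pi * t / duration))

# Assumed, illustrative values (hypothetical, not from the studies cited).
MASS = 1.0                                        # kg, effective moving mass
OMEGA = (0.0, 0.0, 10.0 * 2.0 * math.pi / 60.0)   # 10 rpm counterclockwise, rad/s
DURATION = 0.5                                    # s
PEAK = 1.0                                        # m/s

def force_profile(n=11):
    """Sample the Coriolis force at n evenly spaced times during a forward (+y) movement."""
    samples = []
    for i in range(n):
        t = DURATION * i / (n - 1)
        v = (0.0, bell_speed(t, DURATION, PEAK), 0.0)
        samples.append((t, coriolis_force(MASS, OMEGA, v)))
    return samples

if __name__ == "__main__":
    for t, f in force_profile():
        print(f"t={t:.2f} s  F_c=({f[0]:+.3f}, {f[1]:+.3f}, {f[2]:+.3f}) N")
```

Under these assumptions the force is purely lateral (along +x, to the right of a forward movement during counterclockwise rotation), is zero at movement onset and offset, and peaks at mid-movement, mirroring the velocity profile.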

We soon found that the paths of head movements made in the rotating 
room were influenced by the Coriolis forces acting on the head coupled with 
vestibulocollic reflexes resulting from the complex pattern of semicircular canal 
stimulation. With repeated head movements, the displacement component related 
to the Coriolis force acting on the head as an inertial mass rapidly adapted and 
within 40 movements no longer produced a deviation. By contrast, the semicircular 
canal contribution took much longer to adapt (Lackner & DiZio, 1998). 

To understand further the role of the Coriolis forces on movement control 
we measured how accurately subjects could point to targets during rotation. 
We thought of the arm as an appendage with mass, like the head, but without a 
vestibular system to complicate the interpretation of the findings. We developed 
a very simple paradigm. Subjects seated at the center-of-rotation made reaching 
movements to targets before, during, and after exposure to constant velocity rotation 
at 10 rpm (Lackner & DiZio, 1994). The results are illustrated in Figure 3.5 and 
show that pre-rotation subjects reached straight and accurately to the target. Their 
initial reaches during rotation were deviated in the direction opposite the Coriolis 
forces generated and missed the target. With additional reaches, movement paths 
became progressively straighter and more accurate. Within 40 reaches, baseline 
reaching trajectories and accuracy were regained. The initial reaches post-rotation 
showed a pattern exactly opposite to the initial per-rotation reaches, thus indicating 
the persistence of an adaptive compensation gained during rotation that was no 
longer relevant, which then gradually decayed, restoring reaches back to baseline 
patterns. Per-rotation, after full adaptation had been achieved, the subjects no 
longer consciously felt the Coriolis forces acting on their arm. Their reaches felt 
completely normal. Post-rotation, when they reached, it felt as if Coriolis forces 
were again deviating their arm. What they were then sensing as an “external force” 
was their own central nervous system’s (CNS) compensation for an expected, but 
actually absent, Coriolis force. 

Figure 3.5 Illustration of reaching movements of subjects to a target 
before, during, and after exposure to 10 rpm counterclockwise 
rotation in a fully-enclosed slow-rotation room (overhead view of 
reaching paths, with separate traces for pre-rotation, per-rotation, and 
post-rotation reaches). Complete adaptation occurs within 40 reaches. 
At that time the Coriolis force generated is no longer perceptually salient 
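
The logic of the after-effect can be illustrated with a generic trial-by-trial error-correction model. This is a textbook-style sketch, not the analysis used in the studies cited: the learning rate of 0.15, the arbitrary perturbation units, and the function name are all assumptions, chosen so that the error is nearly gone within 40 reaches.

```python
def simulate(perturbation, rate=0.15):
    """Return the endpoint error on each reach for a sequence of perturbation values."""
    compensation = 0.0
    errors = []
    for p in perturbation:
        error = p - compensation        # part of the perturbation not yet compensated
        errors.append(error)
        compensation += rate * error    # feed-forward command updated after each reach
    return errors

# 10 reaches before rotation, 40 during, 40 after (perturbation in arbitrary units).
schedule = [0.0] * 10 + [1.0] * 40 + [0.0] * 40
errors = simulate(schedule)

print(f"first per-rotation error:  {errors[10]:+.3f}")
print(f"last per-rotation error:   {errors[49]:+.3f}")
print(f"first post-rotation error: {errors[50]:+.3f}")
print(f"last post-rotation error:  {errors[-1]:+.3f}")
```

In this toy model the first post-rotation error is nearly equal in size and opposite in sign to the first per-rotation error, because the learned compensation is still being issued although the perturbation is gone. It then decays back toward zero, as described for the post-rotation reaches.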

These findings made us wonder whether Coriolis forces of significant 
magnitude were generated on the arm and head during everyday activities. 
If these forces were novel, why would individuals adapt so rapidly? 
Consequently, we looked at the situation in which subjects turn their torso and 
simultaneously reach to a target. We were curious whether during such turn and 
reach (T&R) movements individuals would stagger their torso rotation and arm 
reach in order to minimize the generation of Coriolis forces. We found this not to 
be the case (Pigeon, Bortolami, DiZio & Lackner, 2003a, b). The Coriolis forces 
generated by T&R movements were much greater than those on the arm in our 
SRR experiments. Nevertheless, the T&R movements were straight and accurate, 
even when movements were made at very high torso (200°/s) and arm (1600 
mm/s) peak velocities. The peak velocities of arm and torso were also attained 
within 70 msec of each other thus enhancing the magnitude of the Coriolis force 
on the reaching arm. During T&R movements, the self-generated Coriolis forces 
were never consciously sensed by the subjects who were simply unaware of their 
presence. 
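
A back-of-the-envelope comparison gives a sense of scale. The sketch below computes the Coriolis acceleration magnitude, 2ωv, per unit mass. It assumes that the arm velocity is perpendicular to the rotation axis, that torso and arm peak velocities coincide exactly (the text reports them within 70 msec of each other), and, purely for comparison, that the same arm speed applies in the rotating room, a value the chapter does not report. Function names are illustrative.

```python
import math

def coriolis_accel(omega_rad_s, speed_m_s):
    """Magnitude of Coriolis acceleration, 2*omega*v, for v perpendicular to the rotation axis."""
    return 2.0 * omega_rad_s * speed_m_s

def rpm_to_rad_s(rpm):
    """Convert revolutions per minute to radians per second."""
    return rpm * 2.0 * math.pi / 60.0

ARM_SPEED = 1.6                     # m/s (1600 mm/s peak arm velocity, from the text)
TORSO_RATE = math.radians(200.0)    # rad/s (200 deg/s peak torso velocity, from the text)
ROOM_RATE = rpm_to_rad_s(10.0)      # rad/s (10 rpm rotating room, from the text)

a_turn_reach = coriolis_accel(TORSO_RATE, ARM_SPEED)
a_room = coriolis_accel(ROOM_RATE, ARM_SPEED)   # same arm speed assumed for comparison

print(f"turn-and-reach: {a_turn_reach:.2f} m/s^2 per unit mass")
print(f"rotating room:  {a_room:.2f} m/s^2 per unit mass")
print(f"ratio:          {a_turn_reach / a_room:.2f}")
```

Because 10 rpm corresponds to 60 degrees per second, the ratio depends only on the two rotation rates (200/60, roughly 3.3) when the arm speed is held equal.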

We found that if subjects picked up a substantial weight and made a rapid 
T&R movement, they reached accurately and compensated appropriately for the 
extra Coriolis force generated by the increased effective mass of the arm, and did 
so with either arm despite never having held the object before (Pigeon, DiZio 
& Lackner, 2013). To be able to do this the subjects’ CNSs must be predicting 
the forthcoming trajectories of arm and torso and generating a compensation for 
the impending Coriolis force that will be created by their motion. Much of this 
compensation is “feed-forward” in nature, anticipating and countering the effect 
that the forthcoming Coriolis force generated will have on the trajectory of the arm 
(Pigeon et al., 2013). 

Figure 3.6 shows an apparatus that we have developed for re-mapping the 
relationship between torso rotation relative to the feet and relative to external 
space (Hudson, DiZio & Lackner, 2005). The subject stands on a servo-controlled 
circular platform mounted on a powerful motor. Rate sensors are attached to 
the subject’s torso and can be used to control the platform. The experiment is 
conducted in the dark and the subject simply rotates his/her body to face targets 
mounted on a surface attached to the platform as they are illuminated. One target 
extinguishes with the onset of torso motion and then comes back on when the 
subject’s movement is completed. Then it goes out and a second target is lit that 
the subject turns to, and the sequence is repeated. The rate sensors on the subject’s 
body are used to control the platform and either negative or positive gains can 
be introduced but only negative gain will be described here. It was introduced 
gradually in 0.05 increments while the subjects made multiple torso rotations to 
the targets before the gain was changed again, until -0.5 was achieved. In this 
circumstance, when a subject rotated 60 degrees counterclockwise with respect 
to his/her feet to face a platform fixed target, the platform rotated 30 degrees 
clockwise. The subject was making a 60 degree torso rotation relative to his/her 
feet but only a 30 degree rotation with respect to external space. However, the 
subject always experienced the platform as being stationary (because the gain 
changes were introduced very gradually, below detection threshold) and perceived 
a 60 degree spatial displacement of the head and torso with respect to external 
space although the horizontal semicircular canal signal was elicited by a 30 degree 
rotation of the head, not one of 60 degrees. This pattern means that the sense of 
spatial displacement was being determined by a foot-relative reference frame, not 
a space-based one related to the semicircular canal signal generated. 

Figure 3.6 Apparatus used to re-map the relationship between voluntary 
rotation of the torso relative to the feet and torso displacement 
with respect to external space 
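
The geometry of the gain manipulation can be written out in a few lines. In this minimal sketch, counterclockwise rotation is taken as positive and the platform is assumed to rotate by exactly the gain times the torso rotation relative to the feet; the function names are illustrative and do not describe the actual servo controller.

```python
def platform_rotation(torso_re_feet_deg, gain):
    """Platform rotation driven by the torso rate sensors (positive = counterclockwise)."""
    return gain * torso_re_feet_deg

def torso_re_space(torso_re_feet_deg, gain):
    """Torso rotation relative to external space: rotation on the platform plus platform rotation."""
    return torso_re_feet_deg + platform_rotation(torso_re_feet_deg, gain)

def gain_schedule(target=-0.5, step=0.05):
    """Gains visited while stepping gradually from 0 to the target gain."""
    n = round(abs(target) / step)
    sign = -1.0 if target < 0 else 1.0
    return [round(sign * step * k, 10) for k in range(n + 1)]

turn = 60.0   # degrees counterclockwise relative to the feet
print("platform:", platform_rotation(turn, -0.5), "deg")        # negative = clockwise
print("torso relative to space:", torso_re_space(turn, -0.5), "deg")
print("gain steps:", gain_schedule())
```

With a gain of -0.5, a 60 degree turn relative to the feet moves the platform 30 degrees clockwise and the torso only 30 degrees in space, which is the mismatch between the foot-relative and canal-based signals described above.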

We found that when a subject adapted to -0.5 platform gain makes a T&R movement 
immediately after platform gain is returned to zero, so that the platform does 
not rotate when the subject turns, two things happen. The subject turns about 
50 percent of the intended amplitude and the reaching arm makes a huge angular 
error, overshooting the target. These errors result because after the subject 
had adapted to the progressive negative gain of the platform, less and less 
voluntary torque was required to turn the body relative to the feet. The platform 
had turned progressively more in the opposite direction increasing the torso- 
relative to feet displacement. Consequently, much less force was generated than 
necessary to turn the torso to the target when platform gain was returned to 0. 
The feedforward motor compensation issued for the anticipated Coriolis force 
on the arm however was based on the expected velocity profile of the torso. 
Consequently, the magnitude of the compensation initiated for the expected 
Coriolis force far exceeded its actual value and drove the arm past the angular location 
of the target. When a subject adapted to -0.5 gain of the platform uses a pointer to 
indicate his or her voluntary angular spatial rotation, a 60 degree displacement 
will be indicated for a 30 degree rotation relative to space. This observation 
means that a foot-relative reference frame is being used to interpret or calibrate 
the signals from the semicircular canals. 

This study provided an avenue to understanding an observation we had made 
many years before. The time constant of the cupula-endolymph system of the 
horizontal semicircular canals in the human is on the order of 6-7 sec. However, 
the nystagmus and sense of rotation induced by a step velocity change to the 
canals has a time constant of 12-15 sec. This difference is attributed to “velocity 
storage,” the idea that the canal afferent signals that reflect head velocity project 
both to the vestibular nuclei and to the parapontine reticular formation where the 
signal is integrated. The semicircular canal velocity signal on being integrated 
serves as an indication of the displacement of the head; it codes the magnitude of 
space relative head rotation. 
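
The practical meaning of the two time constants can be seen by treating each response as a simple first-order exponential decay after a step change in velocity. That functional form and the midpoint values used below (6.5 s and 13.5 s, taken from the ranges quoted above) are simplifying assumptions for illustration only.

```python
import math

def step_response(t, tau):
    """First-order decay of the rotation signal after a step change in velocity."""
    return math.exp(-t / tau)

def time_to_fraction(fraction, tau):
    """Time at which the response has decayed to the given fraction of its initial value."""
    return -tau * math.log(fraction)

TAU_CUPULA = 6.5     # s, midpoint of the 6-7 s range in the text
TAU_STORAGE = 13.5   # s, midpoint of the 12-15 s range in the text

for t in (0, 5, 10, 20, 30):
    print(f"t={t:2d} s  canal={step_response(t, TAU_CUPULA):.3f}  "
          f"with velocity storage={step_response(t, TAU_STORAGE):.3f}")
```

Under these assumptions the response with velocity storage takes roughly twice as long to fall to any given fraction of its initial value, since decay time scales directly with the time constant.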

We had observed earlier that when an individual rotating at constant velocity in 
the dark is exposed to a step velocity change, the resulting nystagmus and sense 
of rotation are highly gravitoinertial force dependent. In weightless conditions, the 
time constant of nystagmus was closer to that of the cupula-endolymph system, 
and was greatly reduced compared to 1g and 2g test conditions (DiZio, Lackner 
& Evanoff, 1987a, b; DiZio & Lackner, 1988). This pattern means that velocity 
storage is greatly reduced or abolished in weightless conditions (DiZio & Lackner, 
1989). One of the features of space flight is that astronauts often lose track of 
where they are in their spacecraft—especially if they awaken in the dark or enter a 
chamber in an unusual orientation, one not aligned with the “architectural vertical” 
of the chamber. The absence of velocity storage is consistent with such spatial 
errors (Lackner & DiZio, 2000). 

If velocity storage were suppressed or absent in weightlessness, then 
an individual rotated to a new position in weightless conditions should not 
experience a spatial angular displacement. We tested this hypothesis in parabolic 
flight using an apparatus that rotated recumbent subjects in yaw about their Z 
axis at accelerations well above canal thresholds. The findings were conclusive: 
blindfolded subjects rotated in 0g sensed an initial pull or tug in the direction of 
rotation but experienced no change in spatial angular position. By contrast, in both 
1g and 1.8g backgrounds, they could indicate accurately their change in orientation 
(Bryan, Bortolami, Ventura, DiZio & Lackner, 2007). These observations provided 
the key to understanding why head movements during rotation are innocuous in 
weightless conditions. Because of the absence of velocity storage, individuals do 
not experience prolonged head motion and displacement about multiple axes. Their 
gyros are not being tumbled because the velocity signals from the semicircular 
canals are not being integrated to provide spatial displacement signals. Coupled 
with the fact that in-flight testing did not begin until Mission Day 6, when head 
movements per se were no longer provocative, the absence of velocity storage in 
0g explains the results of the Skylab M-131 experiment. 

As mentioned at the beginning of this chapter, the general view of the vestibular 
system has been that it functions like inertial navigation systems that have triaxial 
angular and linear accelerometers. However, unlike inertial navigation systems, 
the vestibular system only has two types of linear accelerometers—the utricle and 
saccule oriented perpendicular to each other. Otolith organ response is determined 
by gravitoinertial force magnitude and direction and many experiments have 
shown that during static tilt about the roll and pitch body axes when GIA level 
is increased above 1g, the amplitude of experienced body tilt increases as well 
(Correia, Hixon & Niven, 1968). These results confirm the notion that the otolith 
organs are linear accelerometers activated by gravity and inertial accelerations 
responding to their vector resultant. 

It is important to ask, however, why would terrestrial creatures evolve a system 
to respond to gravitoinertial force level? Most only experience non-1g force levels 
momentarily when they are walking, running, jumping, swimming, flying, or 
diving. Force changes are associated with self-initiated activities in the context 
of a static 1g background force of gravity, or when wind or water currents are 
displacing or impeding the body. Experience with vehicles has come much later, 
both for humans and animals, and one consequence has been motion sickness— 
even for fish that are passively transported in water tanks on boats. 

These considerations made us wonder whether otolith organ activity might 
be interpreted in relation to a 1g background standard (Bortolami, Rocca, Daros, 
DiZio & Lackner, 2006; Bortolami, Pierbon, DiZio & Lackner, 2006). Taking this 
perspective we developed a new model of vestibular function that interprets static 
otolith shear forces in relation to the background force of gravity. It computes the 
head tilt that in 1g would produce the shear forces on the otolith organs that are 
actually present. The model is shown in Figure 3.7. It predicts the results of all 
previous work on the effects of GIA on apparent body tilt in pitch and roll, the 
only axes that had been studied. For a weightless environment, the model does 
not generate an output or orientation with respect to space but relies on tactile 
stimulation of the body to disambiguate the direction of up and down. In the 
absence of tactile stimulation, no orientation is specified, a result that coincides 
with experimental findings that we had made years before. 
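
A deliberately simplified, single-axis reading of this idea can be sketched as follows. It assumes that the shear on the otolith membrane scales with the sine of the actual tilt times the GIA magnitude, and that the apparent tilt is the angle that would yield the same shear in 1g. The published model works with two independent projections in three axes, so the function below is an illustrative assumption rather than the model itself.

```python
import math

def apparent_tilt_deg(actual_tilt_deg, gia_in_g):
    """Tilt that would produce the same otolith shear in 1g; None if there is no shear-based estimate."""
    if gia_in_g == 0.0:
        return None                        # weightless: orientation is left to tactile cues
    shear = gia_in_g * math.sin(math.radians(actual_tilt_deg))   # shear in units of g
    shear = max(-1.0, min(1.0, shear))     # 1g cannot produce more than 1g of shear
    return math.degrees(math.asin(shear))

for g_level in (0.0, 1.0, 1.8):
    print(f"GIA = {g_level} g, actual tilt 20 deg -> apparent tilt:",
          apparent_tilt_deg(20.0, g_level))
```

In this sketch the apparent tilt equals the actual tilt at 1g, grows when the GIA level rises above 1g, and is undefined in weightlessness, which matches the qualitative pattern described in the text.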

Figure 3.7 The figure shows how the shear forces on the otolith organs, 
the utricles and the saccules illustrated on the lower left, are 
used to compute head orientation in a terrestrial framework, a 
1g internal representation. Panel text: “Otolith model for determining 
roll, pitch and yaw orientation. Goal: estimate φ relative to head 
coordinates, in three axes. Postulate 1: the estimate is based on two 
independent projections of f, p_R and p_P. Postulate 2: projections are 
interpreted as if they were produced by 1 g (internal model). Yaw: based 
on p_R, p_P.” 

Because of its assumption that otolith signals are interpreted in relation to the 
1g gravity of Earth, the model makes a prediction that no other model can. Namely, 
when an individual is recumbent and positioned in different yaw angles about the 
Z-axis, if background g-level is increased, then an increase in apparent body tilt 
will not occur. This unique prediction results because the model computes the 
angle of recumbent yaw tilt from the inverse tangent of the ratio of the shear 
forces in pitch and roll on the otolith membranes. The shear forces increase with 
higher GIA level but the computed yaw angle does not, because the effects of g and 
specific force (f) cancel in the ratio (see Figure 3.7). 
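
The cancellation can be checked numerically. The sketch below assumes that, for a recumbent subject, the two shear projections scale as the sine and cosine of the yaw angle times the GIA magnitude. Which projection is labeled roll and which pitch is an assumption here, as are the function and variable names.

```python
import math

def recumbent_yaw_deg(yaw_deg, gia_in_g):
    """Yaw angle recovered from the ratio of two shear projections; None in weightlessness."""
    if gia_in_g <= 0.0:
        return None                                        # no shear, so no ratio to compute
    p_roll = gia_in_g * math.sin(math.radians(yaw_deg))    # assumed roll-axis projection
    p_pitch = gia_in_g * math.cos(math.radians(yaw_deg))   # assumed pitch-axis projection
    return math.degrees(math.atan2(p_roll, p_pitch))       # GIA magnitude cancels in the ratio

for yaw in (-60.0, -30.0, 0.0, 30.0, 60.0):
    print(f"yaw {yaw:+5.1f} deg -> 1g: {recumbent_yaw_deg(yaw, 1.0):+6.2f}, "
          f"1.8g: {recumbent_yaw_deg(yaw, 1.8):+6.2f}")
```

Both projections grow with the GIA level, but their ratio, and therefore the computed yaw angle, is the same at 1g and 1.8g.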

We tested this prediction in parabolic flight experiments and fully confirmed 
it. The results are shown in Figure 3.8 and are striking (Lackner & DiZio, 2009). 
There is no difference as a function of increased force level. The model also 
explains something no other model does. Labyrinthine-defective subjects often 
experience oculogravic illusions, just as subjects with normal vestibular function do, 
but of a lower magnitude (Graybiel, Miller, Newsom & Kennedy, 1968). Our 
model, by interpreting GIA in relation to a 1g standard and incorporating tactile 
input as a disambiguating factor, predicts precisely this pattern. 

Figure 3.8 Results of six subjects tested in parabolic flight when at 30 
degrees and 60 degrees left ear down (LED) or right ear down 
(RED) or at 0 degrees (horizontal). The 1g and 1.8g results 
are virtually identical as predicted by the model shown in 
Figure 3.7. Panel title: Recumbent yaw (N=6); horizontal axis: head yaw 
angle relative to vertical (deg), from LED to RED 

Feynman’s description of how inertial navigation systems are calibrated 
using the direction of Earth gravity has enabled us to make sense of a wide 
range of experimental findings that took many years to decipher. It pointed to the 
importance of using Earth gravity to calibrate the inertial navigation system. We 
realized in retrospect that this is also how vestibular linear acceleration signals are 
interpreted, in relation to a 1g background acceleration. The conclusions that have 
been derived from our experiments and modeling include: 

1. The vestibular system does not contribute to a sense of body orientation in 
weightless conditions. 

2. Integration of semicircular canal signals to provide a sense of angular body 
displacement does not occur in weightless conditions. 

3. The absence of velocity integration in weightless conditions explains 
why Coriolis cross coupling stimulation—as in the Skylab M-131 
experiment—is not provocative in weightless conditions. 

4. Touch and pressure cues associated with dynamic patterns of body motion 
can override otolith signals and determine apparent body motion and 
patterns of compensatory eye movements. 

5. Velocity integration of semicircular canal signals to generate apparent 
angular displacement of the body has to be recalibrated on an ongoing 
basis to maintain accuracy. A mechanism for such re-calibration is through 
voluntary body turning and is based on a foot-based or foot-relative 
reference frame. Such updating is rapid and is going on continuously in 
everyday life. 

6. Otolith signals can also be calibrated in a foot-based reference system. 
The body is basically bilaterally symmetric and the patterns of pressure 
distribution on the feet coupled with the relative loading of the two legs 
serve as an indication of where the body center of mass is with respect to 
the direction of gravity. 

7. Our model of vestibular function is based on using the 1g acceleration of 
Earth gravity as a standard against which to interpret the forces acting on 
the otolith organs. The model relies on contact forces on the body surface 
to disambiguate the direction of “up” and “down” and makes novel 
predictions that have been experimentally confirmed in parabolic flight 
studies. In the absence of shear otolith forces, no orientation is specified 
and velocity storage of semicircular canal signals does not occur. 

8. Creatures on Earth evolved vestibular systems in a context in which the 
acceleration of Earth gravity was omnipresent and departures from lg were 
encountered only during self-locomotion and passive motion resulting 
from wind and water currents. Exposure to unusual force backgrounds only 
occurred with the use of animals and vehicles for transport: initially horses, 
camels, oxen, and rafts, then boats, railroads, and aircraft, and now 
rocket ships. Passive transport is typically associated with the evocation 
of motion sickness and even disorientation until adaptive compensations 
occur to restore homeostatic control. 
NextGen refers to the transformation of the National Airspace System through 
satellite-based air traffic management and technological innovations that will 
enhance trajectory precision, communications, and weather forecasting. Included 
in this new infrastructure will be a shift of some roles and responsibilities from 
ground to air. Implications of NextGen changes for flight crews are likely to 
be profound, and much if not most recent aviation research has been directed 
toward these issues. NASA and FAA have the greatest research and development 
responsibilities for NextGen, particularly in the area of identifying and responding 
to human factors-related issues. Information concerning research and findings 
must be shared across these agencies so that their work can be integrated 
seamlessly and deliver the anticipated NextGen operational improvements. To this 
end, we created a database of NASA and NASA-sponsored research related to 
NextGen flight deck issues and operations. The database includes 339 documents 
produced by NASA or NASA-funded researchers in the years 2006-2012. 

Documents are products of NASA’s Airspace and Aviation Safety Program 
efforts to identify and resolve key flight deck human factors issues in NextGen, 
challenges to efficient operations, areas in which technological advances are 
predicted to facilitate NextGen operations, research findings that can be used to 
develop NextGen procedures, and the potential impacts of off-nominal events. 
In this chapter, we summarize some of the main points culled from the database, 
organized in terms of operations issues and changes, enabling technologies, and 
human factors issues. We discuss changes or enhancements from current operations 
to planned NextGen operations, the degree of success thus far in implementing 
an operational concept or technology and next steps in implementation, research 
findings or identified holes in the research including missing methodologies or 
variables, and human factors issues and/or solutions for the NextGen flight deck. 
Only a limited summary is included in this chapter due to space constraints; 
however, the NASA NextGen Flightdeck Literature Database spreadsheet can be 
downloaded from http://online.sfsu.edu/kmosier/. 
Operations Issues and Changes 

Trajectory-Based Operations (TBO) 

The concept of operations (conops) for NextGen assumes TBO—the use of 
precise four-dimensional trajectories (4D, including a time component). TBO will 
involve ground-based computer systems with knowledge of nearby aircraft that 
would aid in scheduling and separation of those aircraft (FAA, 2009). Datalink 
communications technology would enable the uploading of strategic trajectories 
and trajectory negotiation (for example, Coppenbarger, Mead & Sweet, 2009; 
Mueller & Lozito, 2008). Several NASA research efforts have demonstrated 
the concept’s potential to increase flight path efficiency and runway throughput 
through human-in-the-loop (HITL) simulations involving both flight crews and 
air traffic controllers (for example, Johnson et al., 2010; Prevot, Callantine, Homola, 
Lee & Mercer, 2007). Additionally, through the manipulation of an aircraft’s 4D 
trajectory and/or its required time of arrival at a designated waypoint, TBO can 
be used to resolve potential conflicts and aid in avoidance of adverse weather 
conditions (Wu, Duoig, Koteskey & Johnson, 2011). One concern regarding the 
implementation of TBO is the technological requirements for participating aircraft. 
Recent studies have demonstrated that current technologies and procedures, 
specifically the Flight Management System (FMS), datalink and TCAS (Traffic 
Alert and Collision Avoidance System), may be sufficient for running TBO but 
may not fully realize the potential for increased efficiency without additional 
technological enhancements. Multiple new technologies such as TBO-AID 
(TBO-Adaptive Information Display; Bruni, Jackson, Chang, Carlin & Tesla, 2011) and 
the future air navigation system (FANS) have been designed to aid flight crews in 
the use of TBO without increasing workload (for example, Coppenbarger et al., 
2009); however, many aircraft currently in operation are not equipped with these 
technologies. 
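
The time component of a 4D trajectory can be made concrete with a small sketch. The example below is illustrative only and is not drawn from any of the cited systems: the `Waypoint4D` type, the waypoint name, and all numeric values are hypothetical. It shows how a required time of arrival (RTA) at a waypoint translates into the ground speed an aircraft must hold, and how moving the RTA later (one way of resolving a conflict) lowers that speed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Waypoint4D:
    """A trajectory point: lateral position, altitude, and a time constraint."""
    name: str
    lat_deg: float
    lon_deg: float
    alt_ft: float
    rta_s: float  # required time of arrival, in seconds from a shared reference


def required_ground_speed_kt(distance_nm: float, now_s: float, rta_s: float) -> float:
    """Ground speed (knots) needed to cover distance_nm by the time rta_s."""
    remaining_s = rta_s - now_s
    if remaining_s <= 0:
        raise ValueError("RTA has already passed")
    return distance_nm / (remaining_s / 3600.0)


# Hypothetical waypoint and values, for illustration only.
merge_fix = Waypoint4D("MERGE", 37.5, -122.0, 11000.0, rta_s=1200.0)

# 80 nm to run with 20 minutes remaining.
speed = required_ground_speed_kt(distance_nm=80.0, now_s=0.0, rta_s=merge_fix.rta_s)

# Delaying the RTA by two minutes reduces the speed the crew must hold.
delayed = required_ground_speed_kt(
    distance_nm=80.0, now_s=0.0, rta_s=merge_fix.rta_s + 120.0
)
```

A real FMS would also account for winds, speed limits, and the aircraft's performance envelope; none of that is modeled here.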

Departures and arrivals 

Much of the NASA research on arrivals and departures focuses on RNAV (Area 
Navigation) departure and arrival procedures, CDAs (Continuous Descent 
Arrivals; for example, Johnson et al., 2010), and airborne spacing techniques (for 
example, Barmore, Bone & Penhallegon, 2009) as methods to increase runway 
throughput. Recent NASA research also focuses on optimized descents computed 
by FMSs (Stell, 2009, 2011) and flight deck precision-spacing automation for 
fuel-efficient arrivals (Cabrall, Callantine, Kupfer, Martin & Mercer, 2012). The 
Collaborative Virtual Queue concept (CVQ; Burgain, Feron & Clarke, 2009), which 
uses virtual queuing to prevent runway back-ups and enable last-minute flight 
swapping, proposes to create departure pushback slots to enable flight departure 
swapping and prevent overloading the taxiway system. CVQ implementation can 
shorten the average departure taxi time, reduce emissions, provide flexibility for 
airlines to reorder pushbacks, and increase predictability of wheels-off times by 
decreasing taxiway queuing. Some of the research in this area is in the operational 
and modeling stages. In support of modeling efforts, Borchers and Day (2010) 
documented characteristics of aircraft that are either on or vectored from routes 
while executing RNAV precision departures. Models assume that aircraft 
arrive at specified locations on their prescribed paths precisely when ATC expects 
them. Modeling simulations, however, may not accurately reflect circumstances 
involving high traffic and poor weather conditions. Next steps for departure and 
arrival studies include current industry and government efforts concerning air- 
ground communication terminology, design improvements, and chart-database 
commonality for arrivals and departures (Barhydt & Adams, 2006). 

Merging and Spacing (M&S) 

NASA research on M&S suggests that airport throughput can be dramatically 
increased over today’s capacity by use of self-separating aircraft pairs (Barmore 
et al., 2009), and investigations from the flight deck perspective focus on the 
feasibility and/or advantages of various implementations (for example, Barmore 
et al., 2009; Baxley, Barmore, Abbott & Capron, 2006). As with other NextGen 
operations, emphasis is being placed on increased pilot involvement in M&S 
procedures. Flight deck-based M&S is a subset of the Airborne Precision Spacing 
concept of operations (Barmore, Abbott, Capron & Baxley, 2008) developed to 
support the precise delivery of aircraft landing sequentially on the same runway. 
The use of new M&S techniques will also be essential in future CDA operations. 
The effect of new M&S procedures such as MIT (Miles in Trail) on flight crew and 
ATC workload has not yet been sufficiently examined. Additionally, procedures 
must take into account the system effects of aircraft-centric flight crew responses 
during off-nominal situations (for example, Ho et al., 2010). 

Runway, surface, and taxi operations 

The TBO conops is also being examined in the context of surface operations (see 
Foyle, Hooey, Bakowski, Williams & Kunkle, 2011). NASA research efforts 
concerning runway, surface, and taxi operations are tied to TBO and precision 
spacing efforts, and have been geared toward increasing runway throughput via 
improved aircraft spacing precision at landing. Some NextGen concepts involve 
flight deck Surface Trajectory-based Operations (STBO), with time and speed 
components in taxi clearances to the departure runway (Foyle, Hooey, Bakowski, 
Williams & Kunkle, 2011), and implementation of a collaborative decision-making 
framework to support more fuel efficient and lower community noise operations 
while maintaining or increasing runway throughput efficiency (Burgain, Pinon, 
Feron, Clark & Georgia, 2009). Suggested changes to improve the efficiency of 
precision taxi operations include a prototype surface automation tool (Ground 
Operations Situation Awareness and Efficiency Tool—GoSAFE; Verma et al., 
2010) or use of a CVQ (Burgain, Pinon, Feron, Clark & Georgia, 2009) to shorten 
the average departure taxiing time. 

Some work has assessed the efficacy of synthetic vision systems (SVS) as 
well as head-worn and head-up display concepts (HWD and HUD) for surface 
operations to determine whether greater visibility increases situation awareness 
(for example, Arthur et al., 2008; Hooey & Foyle, 2008). Challenges for these 
displays included some nausea experienced by pilots, as well as issues with 
latency, alignment, comfort ergonomics, color, and other display rendering. 

Other research focuses on technology and safety in surface operations. One 
issue in autonomous taxiing is the uncertainty about intentions of other aircraft 
(Hakkeling-Mesland, Beek, Bussink, Mukler & van Paassen, 2011). The Runway 
Safety Monitor (RSM; Jones & Prinzel, 2007) detects runway incursion conflicts 
and generates alerts in time for the crew to avoid collisions. Its detection algorithm 
has been found to be effective in reducing all types of runway incursions and 
eliminating the most severe incursions. The Runway Incursion Prevention 
System (RIPS; Jones & Prinzel, 2006) has been designed to enhance surface 
situation awareness and provide cockpit alerts of potential runway conflicts in 
order to prevent runway incidents. However, some results indicated that in visual 
meteorological conditions (VMC) most pilots were able to acquire incoming traffic 
looking out the cockpit windows—even before incursion alerting was activated. 
Next steps in this area include refining and implementing an alerting system and 
associated flight deck displays for runway incursions. 

Closely-Spaced and Very-Closely-Spaced Parallel Runway Operations (CSPR 
and VCSPR) 

Even with the increased efficiency enabled by NextGen conops, many airports are 
simply not large enough to handle the expected increase in traffic. With many of 
the nation’s major airports located within cities, expansion is not always an option. 
One proposed solution is to insert additional runways near or between existing 
ones, creating (very) closely spaced parallel runways (VCSPR). 

Some airports such as San Francisco International Airport (SFO) currently 
conduct CSPR operations; however, paired approaches to these runways are only 
permitted under VMC. When weather conditions degrade, as they often do at SFO, 
the benefit of the extra runway is negated. Thus one of the major topics in VCSPR 
research is achieving VMC performance capabilities in instrument meteorological 
conditions (IMC). Multiple HITL studies have shown that with the advancements 
in vision and conflict detection technology this performance goal is achievable, 
even with the occurrence of off-nominal events such as aircraft incursions (Verma 
et al., 2009). Additionally, analytical models for calculating the ultimate arrival, 
departure, and potential mixed operation capacity of CSPRs have demonstrated 
that their use in all weather conditions can provide stable and predictable arrival 
capacity (Janic, 2008). 
Approaches to (V)CSPRs would involve pairing one aircraft with another in a 
slightly offset trail position, and another major concern for (V)CSPR operations 
is the potential disturbance caused by the lead aircraft’s wake-vortex (McKissick 
et al., 2009; Verma et al., 2009). Multiple efforts have been made to model the 
behavior of these wake-vortices and establish a wake-free safe zone (Guerreiro 
et al., 2010), and to investigate and predict the behavior of pilots during wake 
intrusions and aircraft deviations toward them (Verma, Lozito, Kozon, Ballinger 
& Resnick, 2008). 

Off-nominal operations 

Off-nominal events pose a significant challenge to NextGen operations particularly 
during approach and landing phases of flight, and have been examined in conceptual 
and model development papers (for example, Burian, 2008), HITL simulations, 
Monte Carlo and other simulation techniques (for example, Volovoi, Fraccone, 
Colon, Hedrick & Kelley, 2009), and meta-analyses of HITL studies (for example, 
Gore et al., 2010; Hooey et al., 2009). Methods such as Trajectory-Based Route 
Analysis and Control (TRAC; Callantine, 2011) have been used to model 
off-nominal situations and recovery plans. NASA work includes studies looking at 
how pilots handle off-nominal events using enhanced vision systems (EVS) and 
SVS, off-nominal events in M&S and in conjunction with CSPR operations (for 
example, Verma et al., 2008, 2009), and the effects of pilots’ responses to 
off-nominal events in future trajectory-based operations (for example, Ho et al., 2010; 
Hooey et al., 2009; Prinzel, Kramer & Bailey, 2007; Wickens, Hooey, Gore, Sebok 
& Koenecke, 2009). Of particular importance is the impact of events that occur 
during high-workload and high-traffic phases of flight (for example, approach 
and landing, especially to CSPRs), as they may disrupt CDAs and disrupt airport 
operations. 


Enabling Technologies 

Automatic Dependent Surveillance-Broadcast (ADS-B) 

The operational changes discussed above depend heavily on advanced technology 
for aircraft “seeing,” spacing and safety. ADS-B is a satellite-based surveillance 
technology intended to enable increased capacity and efficiency by supporting 
enhanced visual approaches, (V)CSPR approaches, reduced spacing on final 
approach, reduced separation in other flight phases, surface operations in lower 
visibility conditions, improved situation awareness, improved visibility, and 
reduced environmental impact by allowing controllers to guide aircraft into and 
out of crowded airspace with smaller separation standards than currently possible. 
Many documents in the database include ADS-B capabilities as a variable, but in 
a secondary capacity rather than as a focus of study. Those that focus specifically 
on ADS-B have typically looked at its impact on conflict detection performance 
in conjunction with other display technologies such as CDTI (Cockpit Display of 
Traffic Information) or TCAS (Romli, King, Li & Clarke, 2008). In this work, the 
addition of ADS-B provided small improvements over current conflict detection 
technologies, and aircraft performance capability was the main predictor of 
response time rather than the speed or quality of the external data inputs. The 
future version of ADS-B, ADS-x, is predicted to be a key factor in the information 
exchange needed to triple air traffic capacity by 2025 (Harrison, 2006). 

Alerting and Reasoning Management System (ALARMS) 

Implementation of ALARMS will consist of placing advanced sensor technologies 
into the cockpit to convey a large number of potentially complex alerts (Daiker & 
Schnell, 2010). The ALARMS technology will prioritize aircraft sensor alerts in a 
quick and efficient manner, essentially determining when and how to alert the pilot. 
The research thus far has largely focused on the theoretical implications of the new 
ALARMS system and the challenges that will be associated with implementing it, 
as well as on creating Markov and human motor models to test different ALARMS 
scenarios (for example, Alexander et al., 2010; Carlin, Schurr & Marecki, 2010). 

Displays and vision systems 

Many documents in the database address various display configurations, including 
but not limited to cockpit situation displays (CSDs), HUDs, head-down displays 
(HDDs), and HWDs or helmet-mounted displays (HMDs), EVS, SVS, and 
external vision systems (XVS), as well as monocular and biocular displays. These 
new display technologies and configurations of existing display technologies are 
intended to provide increased visibility, symbology, and information for enhanced 
situation awareness and reduced pilot error, improvements in low-visibility 
operations, and overall enhanced pilot performance, particularly in terminal 
operations (for example, Arthur et al., 2011). Display advancements are used to 
investigate Better Than Visual operations and Better Than Visual technologies 
for all-weather capabilities in NextGen such as below-minimum landings (Arthur 
et al., 2011). Experiments using various display technologies have been geared 
toward identifying pilot perceptions and characterizations of display clutter 
and influences of display clutter on pilot performance (for example, Alexander, 
Stelzer, Kim, Kaber & Prinzel, 2009; Kaber, Alexander, Stelzer, Kim & Hsiang, 
2007; Kim et al., 2011). Some results indicate there may be a clutter “threshold” 
beyond which pilot performance degrades (Kaber et al., 2007). This suggests that 
advanced technologies that increase display clutter may be counter-productive, 
pointing to the need to both eliminate clutter and improve the salience of critical 
symbology and information (for example, Naylor, Kaber, Kim, Gil & Pankok, 
2012). Suggested next steps in this area of research include further investigation 
of new flight deck display technologies, including issues such as readability in 
daylight (color, brightness, contrast), disorientation or nausea, and illusion issues 
(for example, Bailey, Arthur, Prinzel & Kramer, 2007). 

Conflict Detection and Resolution (CD/R) 

NextGen pilots will become increasingly active in the conflict resolution 
decision-making process, and some research has shown that keeping pilots in the 
decision-making loop enhances their situation awareness in conflict resolution 
tasks (Dao et al., 2011)—although potentially at the expense of increased 
workload (Ligda et al., 2010). As the amount of air traffic increases, creating 
conflict resolutions that avoid secondary or cascading conflicts is becoming more 
of a challenge. One major research focus for CD/R is improving the algorithms 
of conflict detection tools in order to create more effective vertical and horizontal 
resolutions with fewer secondary conflicts (for example, Maddalon, Butler, 
Munoz & Dowek, 2009). Some research focuses on integrating and comparing 
new technology with current CD/R systems (for example, Romli et al., 2008) and 
on pilot acceptance of CD/R automation (for example, Battiste et al., 2008). New 
algorithms have been generally successful at creating more effective and efficient 
conflict resolutions with more accurate predictions of future conflicts and recovery 
from loss of separation (for example, Butler & Munoz, 2009). The next step will 
be to test the new algorithms in more diverse and dynamic environments. 

Haptic control 

Haptic control technology enables a control surface or system to provide tactile 
feedback to the pilot. This additional feedback has been demonstrated to improve 
pilot situation awareness of aircraft state and overall pilot performance (Goodrich, 
Schutte & Williams, 2011). Additionally, pilots seem to prefer a haptic flight 
control system to traditional systems. Tactile alerts have also been shown to elicit 
faster responses than auditory interruptions, unless interruptions are very complex 
and/or urgent (Lu, Wickens, Sarter & Sebok, 2011). 


Human Factors Issues 

Attention 

NextGen operations will pose significant challenges for human factors on the 
flight deck. Flight deck responsibilities for activities such as spacing and CD/R 
are predicted to increase, and pilots will be expected to monitor and attend to 
enhanced technological systems and sources of information. Much of the research 
addressing human attention for NextGen applications concerns noticing and 
perceiving events in a situation. One relevant research product is the human 
attention model N-SEEV (noticing time = signal salience, effort needed to attend 
to the signal, expectancy of the signal, and value of the signal). The model 
has been shown to successfully predict the effectiveness of different events in 
capturing pilot attention on the flight deck (McCarley et al., 2009) as well as the 
variance in pilot response to off-nominal situations (for example, McCarley et al., 
2009). N-SEEV has also been applied to the Man-machine Integration Design and 
Analysis System (MIDAS) and has improved the ability of MIDAS to accurately 
model visual attention over previously used probabilistic scan behaviors (Gore, 
Hooey, Wickens & Scott-Nash, 2009). 
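
The four factors named above can be illustrated with a rough sketch. SEEV-type models are commonly written as an additive combination in which salience, expectancy, and value raise the likelihood of attending to an area and effort lowers it. The code below assumes that additive form with unit coefficients. The display areas and ratings are hypothetical, and the noticing-time component of N-SEEV is not modeled, so this should not be read as the published model.

```python
def seev_weights(areas, s=1.0, ef=1.0, ex=1.0, v=1.0):
    """Return normalized attention weights per area of interest.

    `areas` maps a name to (salience, effort, expectancy, value), each in [0, 1].
    Effort counts against attention; the other three terms count toward it.
    The coefficients s, ef, ex, v are illustrative assumptions.
    """
    raw = {
        name: max(0.0, s * sal - ef * eff + ex * exp + v * val)
        for name, (sal, eff, exp, val) in areas.items()
    }
    total = sum(raw.values())
    if total == 0:
        raise ValueError("all areas scored zero")
    return {name: score / total for name, score in raw.items()}


# Hypothetical display areas and ratings, for illustration only.
areas = {
    "primary_flight_display": (0.5, 0.1, 0.8, 0.9),
    "traffic_display": (0.7, 0.4, 0.5, 0.7),
    "datalink_message": (0.3, 0.6, 0.2, 0.4),
}
weights = seev_weights(areas)
```

With these made-up ratings the datalink message receives the smallest share of attention, because it is low in salience and expectancy and costly to reach. This is the kind of prediction such a model is used to make about whether an event will be noticed.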

Another area of attention research concerns checklist monitoring and checking, 
particularly with respect to factors that cause pilots to skip or miss items (for example, 
Dismukes, 2007; Dismukes & Berman, 2010; Dodhia & Dismukes, 2009). One of 
the main causes of checklist errors is interruptions on the flight deck. Dismukes 
and Berman (2010) found that although the base rate of these errors was quite 
low (around 1 percent), the chance that a missed item would be detected was also 
very low (18 percent). Dodhia and Dismukes (2009) found three reasons why 
interruptions cause individuals to forget: (1) interruptions often abruptly divert 
attention, which may prevent adequate encoding of an intention to resume and 
forming an implementation plan; (2) new task demands after an interruption’s end 
reduce opportunity to interpret resumption cues; and (3) the transition after an 
interruption to new ongoing task demands is not distinctive because it is defined 
conceptually, rather than by a single perceptual cue. 
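
The two rates reported by Dismukes and Berman (2010) can be combined to show why a low base rate is still a concern. The sketch below treats the 1 percent figure as a per-item miss rate and uses a round volume of 10,000 checklist items. Both are assumptions made for illustration, not values taken from the study.

```python
def expected_undetected(items_run: int, miss_rate: float, detect_rate: float) -> float:
    """Expected number of missed items that are never caught."""
    return items_run * miss_rate * (1.0 - detect_rate)


# Rates come from the text; the per-item interpretation and the
# 10,000-item volume are illustrative assumptions.
items_run = 10_000
missed = items_run * 0.01
undetected = expected_undetected(items_run, miss_rate=0.01, detect_rate=0.18)
```

Under these assumptions roughly 100 items would be missed and about 82 of them would go undetected, so most omissions persist once they occur.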

Additionally, the relationship between attention and pilot engagement has been 
investigated through brain imaging techniques that provide feedback on how much 
attention the pilot is paying to events, as well as the level of fatigue experienced 
(Harrivel, Hylton & Tin, 2010). Using Functional Near Infrared Spectroscopy 
(fNIRS), Harrivel and colleagues created head gear that collects attentional state 
information over long periods of time and can monitor when a pilot’s attention is 
starting to wane. Although this technology is still in the prototype phase, it could in 
the future be used to help monitor pilot attention. 

Roles and responsibilities 

Implementation of NextGen will entail collaborative decision making between 
air and ground, and some reallocation of roles and responsibilities. With the 
expected increase in air traffic, the potential increase in ATC workload figures to 
be a limiting factor in the number of aircraft the system can handle. One proposed 
solution to this problem is to give flight crews increased responsibility for flight 
paths, especially with respect to spacing and separation from other aircraft. 
Assigning aircraft to self-separation is predicted to be an effective solution to the 
inevitable increase in traffic, particularly when there is some flexibility in flight 
paths (Idris, Shen & Wing, 2010), and can significantly lower ATC workload while 
maintaining an acceptable workload for the flight crews (Johnson et al., 2010). 
HITL and computer simulations have demonstrated that self-separation is able to 
accommodate 2-5x increases in traffic in enroute operations (Prevot et al., 2007; 
Wing et al., 2010). Self-spacing has also demonstrated the capability to increase 
runway throughput and facilitate the use of CDAs through increasing spacing 
precision and arrival accuracy (Kopardekar et al., 2003). Increased efficiency 
due to self-separation also is projected to decrease noise and emissions. A major 
obstacle to the self-separation concept is the accuracy (or lack thereof) of wind forecasts, 
and the lack of realistic wind forecasts was a significant limitation in the existing 
research. Additionally, implementation of self-separation will entail additional 
training, enhanced Crew Resource Management, and tailored procedures within 
the flightdeck (Wing et al., 2010). 

Changes to flightdeck design may also result in changes in the role of the 
pilot. Letsu-Dake, Rogers, Dorneich, and DeMers (2011) focused on “flight decks 
of tomorrow” and came up with two competing ideas of what pilot roles and 
responsibilities could potentially be in the future. The first concept is based on 
a flightdeck with the guiding principle of the “Pilot as Pilot,” which would be 
similar to what we have today. The pilot would be responsible for performance of 
all functions and utilize control, navigation, flight management, and surveillance 
automation to ensure that the aircraft meets its tactical and strategic targets safely 
and efficiently. This contrasts with the second version based on the idea of the 
“Pilot as Manager.” In this scenario the main role of the pilot would be to manage, 
monitor, and collaborate with automation to effectively perform flight deck 
tasks. The fundamental assumption underlying this concept is that highly reliable 
automation handles most tasks while the pilot monitors and verifies that high- 
level mission objectives are being met through appropriate flight deck subtasks. 
These two differing ideas of how future flightdecks may function provide very 
different views of the roles and responsibilities of pilots in NextGen. The authors 
concluded it is likely that future flightdecks will fall somewhere in between these 
two extremes. 

In order for decision-making responsibilities to be shared between air and 
ground, information, goals, and priorities will need to be aligned. Ho, Martin, 
Bellissimo & Berson (2009) looked at the potential of the NextGen air transportation 
system to affect pilot and ATC information requirements and sharing. In this 
study they investigated how pilots and controllers defined their own goals. Their 
findings were that pilots and ATC shared the same high-level goals, but that they 
differed in their sub-goals. ATC sub-goals were system-centric, while the pilots’ 
goals were aircraft-centric. Ho and colleagues found that ATC rarely took into 
account the pilots’ sub-goals; however, the pilots indicated that they wanted the 
controllers to take into account as much as possible the pilots’ important 
sub-goals. Ho et al. (2010) also looked at goals in the context of off-nominal M&S 
situations, and found that pilots’ responses to weather were difficult to predict 
and did not match behaviors preferred from the ATC system-centric perspective. 
These “mismatches created dispersion in the temporal spacing at the merge point 
prior to the descent, flight path stretches that are likely larger than required, higher 
workload, and ultimately unfavorable initial conditions for the CDA operation 
subsequent to the weather encounter” (p. 4.D.3-1). The authors suggested that 
maintaining continuous communications and exchanging information is key to 
developing a collaborative relationship and for effective decision making (Ho et 
al., 2009). 

Operator performance 

Operator performance is a broad area that covers all aspects of pilot behavior and 
encompasses human factors variables such as workload, situation awareness, and 
decision making. A common methodological trend in this area is the modeling of 
pilot performance, particularly with respect to new technologies or off-nominal 
events. NextGen operations such as CDA and technologies such as EVS/SVS or 
other displays typically focus on increasing situation awareness and maintaining 
manageable workload (for a review see Wickens, Sebok, Gore & Hooey, 2012). 
One advance in this topic area has come from the six-year Human Performance 
Modeling project documented in Foyle and Hooey (2008). This project involved 
five different teams applying multiple cognitive models to a common set of 
aviation problems. 

One serious issue that exists with modeling attempts is the extent to which a 
specific model can be considered a valid predictor of pilot performance. There 
have been a variety of approaches to this issue (for example, Cheng et al., 2009; 
Cover & Schnell, 2010; Gil et al., 2009) with no agreed definition of validity. 
One study (Gore et al., 2011) took on the issue of validity with the MIDAS model 
of pilot performance, and focused on increasing model validity through increasing 
input validity. The researchers emphasized the need for an iterative approach that 
calls for constantly updating the input in order to create a model that is a valid 
predictor of pilot performance. In addition, they also called for validating the 
output by comparing it to HITL simulations as a final check of model validity. 
Using a small sample, Kaber and colleagues (2008) compared their model to pilot 
performance and workload with different levels of automation and found that their 
model successfully predicted pilot heart rate, subjective workload, and vertical 
path deviations. 

Other research under the broad umbrella of operator performance includes 
studies on pilot beliefs. For instance, Casner (2009) analyzed the perceived versus 
measured effects on pilot workload and error of four advanced cockpit systems: 
(1) navigation equipment; (2) control methods; (3) flight instruments; and (4) 
navigation instruments. Although pilots enjoyed using all four types of new 
technologies tested, only the GPS and autopilot technology actually reduced pilot 
workload and errors and only in certain phases of flight, suggesting a discrepancy 
between pilot perceptions of how technology enhances performance and its actual 
impact. Research on operator performance has even highlighted the impact of 
crew meals on pilot performance (Barshi & Feldman, 2012). The authors made 
the argument that the elimination of crew meals on many domestic operations 
reduces energy levels, which causes lower performance with respect to higher 
cognitive functions. Specifically, lower blood sugar is associated with a decline in 
the ability to make quick decisions and the ability to perceive risk. They also note 
that pilots are often unaware of any symptoms, increasing the potential danger of 
low blood sugar. 

The topic of operator performance will become increasingly important 
as researchers attempt to predict pilot behavior under NextGen operational 
conditions. A report by Lee et al. (2010) highlights many of the operator 
performance challenges facing NextGen implementation. The authors identified 
four research areas (separation assurance, airspace super density operations, traffic 
flow management, and dynamic airspace configuration) with issues that need to 
be addressed in order for NextGen operations to function smoothly. Lee et al. (2010) 
provided general recommendations for evaluating possible countermeasures for 
problems: (1) further define the concept; (2) develop prototypes of the tools and 
operational procedures; and (3) evaluate the concept via walkthroughs or HITL 
simulation. 

Research on operational changes, technologies, and human factors issues 
in NextGen is still ongoing, and implementation plans continue to evolve. The 
documents discussed in this chapter and outlined in the NASA NextGen Flightdeck 
Literature Database provide a snapshot of the flight deck research focus, and 
describe the research areas deemed by NASA to be critical to the NextGen conops. 

Chapter 5 

Flight Deck Models of Workload and 
Multitasking: An Overview of Validation 

Christopher D. Wickens & Angelia Sebok 
Alion Science and Technology, USA 


The next generation of the airspace program in the US (NextGen) calls for a 
variety of new concepts of operations as well as supporting technologies, such 
as self-separation, data-linked messages, closely spaced parallel operations and 
so forth (FAA, 2013). Naturally it is critical that such features preserve safety, 
even as they will increase efficiency. Traditionally, assessments of safety and 
efficiency of new technology and procedures are accomplished by human-in-the-loop 
(HITL) simulation, yielding results based on response time, errors, workload, 
and, more recently, situation awareness (Strybel, Vu, Battise, & Johnson, 2013). 
However, it is also well understood that such HITL simulations can be extremely 
time consuming, expensive, and often lacking in statistical power because of a 
small sample size, since highly qualified line pilots are often difficult to engage for 
participation in long experiments. 

A complementary approach that we advocate in this chapter is the use of pilot 
or controller computational models of human performance that can be used to 
estimate and predict the level of performance that will be achieved in certain 
conditions (Foyle & Hooey, 2008; Pew & Mavor, 1998; Gray, 2007). To the extent 
that such models are valid (an issue we address extensively in this chapter), they 
can provide satisfactory predictions of performance in a fraction of the time, and 
at a fraction of the cost of full HITL simulations. 

Review of Pilot Models 

This chapter describes one component of a project we performed as part of a larger 
effort for the FAA to establish the state of the art of computational models of pilot 
(for example, flight deck) performance (Wickens, Sebok, et al., 2013). While the 
larger effort addressed numerous aspects of pilot performance, the more specific 
aspects we emphasize in this chapter are models of multitasking and workload. We 
began the overall effort by searching for published research in the period of 
1988-2012 that described (1) computational modeling efforts to predict pilot flight deck 
performance and (2) empirical validation studies to assess the quality of these 
model predictions. We first identified 39 relevant journals, conferences, books, 
and websites where such research might be reported. We then performed keyword- 
based searches to find the specific articles. From this search, we identified 187 
separate modeling and validation efforts. 

Our next task included getting an overview of what the modeling efforts 
addressed. Specifically, we identified 12 aspects of pilot performance that were 
modeled: workload, multitasking, errors, situation awareness, pilot-automation 
interaction, decision making, communication, manual control, procedures, spatial 
disorientation, vision (including visual attention and scanning), and roles and 
responsibilities. 

Each modeling effort identified was characterized in terms of coding criteria 
or features. Through an iterative process of reviewing papers that summarize 
modeling efforts, applying the coding, and identifying problems (categories 
that were not relevant, issues that were not captured in the criteria), the team 
developed a final set of coding features and criteria. These included descriptive 
features such as the type of model (simulation versus equation), aspect of pilot 
performance modeled, and the model name. Another set of criteria addressed 
evaluative features, or ways in which models could be compared. These included 
whether empirical validation studies had been performed, the population from 
which participants were drawn (for example, pilots, college students), the testbed 
on which the experiment was performed (for example, flight simulator or tracking 
simulator), and the types of analyses performed to compare empirical results 
with model predictions. Finally, modeling efforts were assessed in terms of their 
cognitive plausibility, their usability, and their availability. 

Cognitive plausibility refers to whether the elements or terms within the 
model correspond to identifiable cognitive or information processing constructs in 
human psychology. Usability refers to the extent to which the models are relatively 
simple, so that people who are not part of the model development team can readily 
deploy them. Availability refers to whether the model software or equations are 
available to the public. Altogether 12 features were defined to characterize each 
separate research paper that examined a pilot performance model. 

Then we identified the model architecture used for each of the modeling efforts. 
We defined an architecture as a common conceptual framework that is applied 
to develop individual models, or a common tool (such as a specific modeling 
tool, including a computational language, interface, and rules for interaction) for 
implementing these models. An architecture could be a conceptual framework 
alone. One example is the Salience, Expectancy, Effort, and Value (SEEV) 
modeling approach to predict operator visual scanning (Wickens & McCarley, 
2008; Wickens, 2014). SEEV predicts where the operator will look, based on (the 
SEEV) factors of the visual areas of interest (usually displays) in the environment. 
The SEEV framework can be applied to a wide variety of environments (for 
example, flight deck layout, process control panels, submarine displays) and 
implemented in different types of software tools (for example, as described in 
Steelman, McCarley & Wickens, 2011, 2013). Specific examples of software tools 
include the C# programming language, or different human performance modeling 
environments. 
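
As a rough illustration of how a SEEV-style prediction might be computed, the following sketch scores each area of interest as salience minus effort plus the product of expectancy and value, and then normalizes the scores into predicted shares of visual attention. The combination rule, the ratings, the display names, and the function name `seev_attention` are all assumptions made for illustration; they are not taken from the implementations cited above.

```python
def seev_attention(areas):
    """Return the predicted share of visual attention for each area of interest.

    `areas` maps an area name to a dict of salience, effort, expectancy and
    value ratings. The combination rule is an illustrative assumption.
    """
    scores = {}
    for name, f in areas.items():
        raw = f["salience"] - f["effort"] + f["expectancy"] * f["value"]
        scores[name] = max(raw, 0.0)  # clip negative scores to zero
    total = sum(scores.values())
    if total == 0:
        return {name: 0.0 for name in scores}
    return {name: score / total for name, score in scores.items()}


# Hypothetical ratings for three flight deck displays
displays = {
    "primary_flight_display": {"salience": 2, "effort": 1, "expectancy": 3, "value": 3},
    "navigation_display": {"salience": 1, "effort": 1, "expectancy": 2, "value": 2},
    "datalink_display": {"salience": 1, "effort": 2, "expectancy": 1, "value": 1},
}
shares = seev_attention(displays)
```

Because the output is a set of relative shares, the same ratings can be reused to compare alternative display layouts by changing only the effort terms.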

A model architecture can also be a specific software tool, for example, the 
Improved Performance Research Integration Tool (IMPRINT) human performance 
modeling environment. IMPRINT, and other modeling environments such as the 
Man Machine Integration Design and Analysis System (MIDAS; Gore et al., 2013 
[5]), include specific implications for the way in which human performance is 
characterized and modeled. For example, IMPRINT models human performance 
as a series of discrete tasks, rather than a set of complex cognitive processes. 
Finally, a modeling architecture could be a combination of software tool and 
conceptual framework (for example, the Adaptive Control of Thought-Rational, 
or ACT-R; Anderson & Lebiere, 1998). 

Of the 12 aspects of pilot performance that were addressed in various modeling 
efforts (for example, situation awareness, error), five were identified to be highly 
relevant for predicting pilot performance in NextGen operations: Workload and 
multitasking (combined into 1 area), Pilot-automation interaction, Situation 
awareness, Roles and responsibilities, and Pilot error. These five areas were 
examined in greater detail, through a set of deep-dive analyses that reviewed how 
modeling efforts approached the area of interest and how they attempted to predict 
pilot performance (Wickens, Sebok et al., 2013). For example, in considering 
pilot-automation interaction, we found that modeling teams typically focused on 
a specific type of flight deck automation (most frequently the flight management 
system or FMS). Some pilot-automation interaction efforts attempted to predict 
pilot cognition, and focused on detection of visual indications and decision making 
in very specific task sequences. Other efforts attempted to compare automation 
designs in terms of their adherence to human factors design principles, or in terms 
of the predicted pilot workload brought about by interacting with the automation. 

The current chapter focuses only on two closely related aspects of pilot 
performance models: workload and multitasking. (For a comprehensive review 
of the other pilot performance aspects, see Wickens, Sebok et al., 2013). The 
reason for this restricted focus is twofold. (1) Workload and multitasking issues 
are of critical importance in the flight deck, as satellite navigation and improved 
sensors are transferring more responsibilities and tasks from ground to the flight 
deck. (2) These areas capture a long-standing theoretical and practical interest 
of the first author, in their applications to aviation (for example, Wickens, Goh, 
Helleberg, Horrey & Talleur, 2003; Wickens, Sandry & Vidulich, 1983; Wickens 
& McCarley, 2008; Wickens 2002, 2008). 

Model Validation 

A model is not ultimately useful unless it can accurately predict pilot performance; 
and the assessment of accuracy of such prediction derives from model validation. 
The best measure of model validation reflects the ability of a model to accurately 
predict performance (including measures like workload or situation awareness) 
across a set of different flight conditions, so that the differences or variance 
between such conditions is accurately predicted (Wickens, Vincow, Schopper & 
Lincoln, 1997). In Figure 5.1 we provide a hypothetical example of a comparison 
of model prediction (X axis) with empirical data (Y axis). This example includes 
four conditions in which NextGen [N] and Conventional [C] procedures are 
evaluated, together with advanced [A] versus older [O] equipment. Each condition 
indicated by its letter combination (for example, N-A) is plotted on the graph. 


[Figure 5.1: scatter plot of model prediction (X axis) against PITL measured workload (Y axis)] 



Figure 5.1 Correlational scatter plot representation of model validation of 
four predicted conditions 

Notes: N & C represent NextGen and conventional procedures respectively. A and O 
represent advanced and older equipment respectively. The regression line suggests a high 
correlation (r > 0.90) and hence strong validity. 

The ability of the model to predict actual workload is best captured by the 
product-moment correlation (r) between model predictions and pilot-in-the-loop 
(PITL) simulation data. While r describes the overall success of the model, the 
graphic scatter plot (for example, Figure 5.1) provides additional information 
concerning which effects may be effectively or poorly predicted by the model. 
For example, Figure 5.1 suggests that the benefits (in terms of reduced workload) 
of advanced over older equipment are under-predicted by the model. That is, 
the model predicts little benefit (reduced workload) of the advanced equipment, 
but substantial benefits were observed in the PITL data. Also the model does a 
good job of predicting the much greater relative benefit of NextGen procedures 
than of NextGen (advanced) equipment. The large difference between the two N 
conditions (lower left) and the two C conditions (upper right) shows the predicted 
and empirical differences between the two types of procedures. The model 
predicted a large drop in workload for N conditions compared with C conditions, 
and the empirical data revealed that same finding. 

Finally, we note that, while the correlation coefficient should be the benchmark 
or gold standard of validation, other validation efforts short of this may still 
provide useful information. In the following, we do not discriminate between 
these different levels of validation (but see Wickens, Sebok et al., 2013 for a more 
detailed discussion). However, we strongly recommend against using t-tests of 
differences between model prediction and obtained data as a validation measure. 
This is because “not statistically significant” does not imply that there is no 
difference between the two means; and there are reasons other than good model 
fit, particularly low statistical power in the experimental design, that may account 
for the non-significant differences. 
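
A minimal sketch of the correlational validation described above is given below. It computes the product-moment correlation between model predictions and PITL measures for the four conditions of Figure 5.1. All workload values are made up for illustration (they are not data from any study), and were chosen so that the model captures the procedure effect well while under-predicting the equipment benefit.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical workload values: N/C = NextGen/conventional procedures,
# A/O = advanced/older equipment
conditions = ["N-A", "N-O", "C-A", "C-O"]
predicted = {"N-A": 2.0, "N-O": 2.3, "C-A": 5.0, "C-O": 5.2}
measured = {"N-A": 1.5, "N-O": 3.0, "C-A": 4.5, "C-O": 6.0}

r = pearson_r([predicted[c] for c in conditions],
              [measured[c] for c in conditions])

# Mean workload reduction from advanced equipment, predicted versus observed
predicted_equipment_benefit = (
    (predicted["N-O"] - predicted["N-A"]) + (predicted["C-O"] - predicted["C-A"])
) / 2
measured_equipment_benefit = (
    (measured["N-O"] - measured["N-A"]) + (measured["C-O"] - measured["C-A"])
) / 2
```

With these illustrative numbers the overall correlation is high, yet the per-effect comparison still exposes the under-predicted equipment benefit, which is the kind of detail a single r value hides.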

Workload and Multitasking 

The concepts of workload and multitasking are closely related in that both describe 
the limits of the pilot’s information processing capacity, but they are also distinct 
(Wickens & McCarley, 2008). Mental workload generically concerns the relation 
between the total demands on that capacity imposed by single or by multiple 
tasks, and the availability of cognitive resources to meet that demand. While 
higher workload may often diminish performance, it does not necessarily do so, if 
demands do not exceed operator capacity. Hence effective measures of workload 
are often found, not in performance measures, but in physiological or subjective 
measures (Yeh & Wickens, 1988; Wickens & Tsang, 2014). 

In contrast, while performance will depend partially on the relationship 
between resources available and demanded (but only when the demands exceed 
the available resources), there are other factors that may degrade or improve 
multitask performance, particularly when such performance depends on different 
(multiple) resources within human information processing (Wickens, 2008). For 
example, a pilot who must simultaneously activate two controls for two different 
tasks with the same hand may not experience high workload for either task; but 
dual task performance of the tasks depending on the two controls will suffer. 

Because of these different definitions of multitasking and workload, our 
review of validation pays close attention to the measures predicted by models of 
multitasking and workload, and the measures assessed for validation. These issues 
are typically more straightforward for multitasking, because both the predictions 
and assessments are those of the difference in dual task performance decrements 
predicted by the different model factors (for example, voice versus visual data 
link, or NextGen versus conventional procedures). 

In contrast, model predicted “workload” is not always well or explicitly defined 
in these papers, and the assessment of workload often reflects the multiplicity of 
techniques used (Wickens & Tsang, 2014). That is, workload may be defined as 
secondary tasks which tap reserve capacity when workload is not excessive, or as 
performance decrements when workload is excessive, and as either subjective or 
physiological measures across all levels of workload demand (Vidulich & Tsang, 
2012). 


Results: Workload and Multitasking Model Validation 

Altogether 23 modeling efforts were identified that focused on either the workload 
or multitasking aspect (some of these may have also focused on other aspects as 
well), and most of these contained empirical PITL data in their write-up (although 
these data were often not adequate to be considered true validation). We represent 
these efforts in the context of Table 5.1 which represents the multi-dimensional 
array of model types. Table 5.1 contains three columns. Each column is arrayed 
on a continuum of increasing model complexity or sophistication from bottom to 
top. The dimension of complexity is separate for single resource models of effort 
(left column) and of time (center column) and for multiple resource models (right 
column). However here we note that greater complexity in terms of what the model 
addresses typically requires greater sophistication, specialization, and training of 
the model user or analyst. This complexity often makes the model less ubiquitous 
and usable, even as it might also avail greater precision of model predictions, 
and hence increase its validity. We also note that a given model may populate 
more than one column in the table, being for example, simple on one column 
(for example, a simple representation of time scheduling), but more complex on 
another (for example, invoking more complex multiple rather than less complex 
single resources). 

Within each column, each number represents a unique modeling effort, whose 
identity can be found in part A of the reference list. Each effort (number) is in turn 
associated with two additional attributes. We have tried to classify (1) the extent 
to which the modeling effort is validated or not, characterized by the underline of 
the ID number, and (2) the extent to which we consider the modeling effort carried 
out on tasks or within a context that may be considered high fidelity or validity 
to transport aircraft (for example, high cockpit automation, use of line pilots in 
validation data) or not, characterized by the bracketed bold face coding for higher 
fidelity. Thus a modeling effort coded by both attributes (for example, 67) would 
be considered particularly valuable for predicting NextGen performance. We note 
that some numbers appear more than once; for example, when HITL data from a 
given study was validated against two (or more) models. 

At the base/foundation of the table, we consider the simplest view of attention 
or processing capacity as a limited single resource. At this fundamental level, there 
are two different conceptions as to what this resource may be. On the left, it is 
considered to be a limited “pool” of processing effort (Kahneman, 1973), that has 
a physiological basis in brain metabolism (Parasuraman & Rizzo, 2007; Wickens, 
Hollands et al., 2013, Chapter 11), and can be assessed by physiological measures 
such as heart rate variability, or by subjective rating scales (Rikard & Levison, 
1978). Thus for example, workload may be excessive if handling qualities of an 
iced or partially disabled aircraft are predicted to be unstable. 

Table 5.1 Three aspects of model complexity increasing from bottom to top 

Single Resources Required                                       Multiple Resources Required
Effort                          Time
------------------------------  ------------------------------  ------------------------------
                                                                Demand-resource interaction
                                                                [14]

Computational model of          Formal models (e.g., ACT-R)     Weighted additive conflict
cognitive complexity            [1, 5, 12, 15, 16, 17]          between resources
[1, 11, 18]                                                     [5, 14, 15, 22, 23]

                                Simple queuing models           “VACP” (Overload/channel,
                                [10, 21]                        e.g., >5) [4, 9, 15]

Effort (e.g., SWAT, NASA TLX,   Task overlap [7, 15, 20]        “VACP” summed over channels
Bedford)                                                        [4, 15]
Pilot assigned or SME
assigned [8, 13]

                                Time (time required/time        VACP table lookup
                                available), TLAP [11, 20]       [2, 3, 4, 19, 20]

Note. Each model effort is coded by a number, whose full citation is found in part A of the 
reference list at the end of the chapter. Bold font numbers indicate that high fidelity experiments 
were performed. Underlined font numbers indicate that the model has been validated. 

In the center column, the single limited resource may be considered to be time, 
and hence workload may be characterized by the ratio of [time required] to [time 
available]; and performance breakdowns can be related exclusively to the extent to 
which the time required exceeds the time available. Thus, for example, workload 
may be considered excessive if the time required to perform a checklist is greater than 
the time available before the next checklist must be initiated. We now describe the 
increasing complexity (from bottom to top) within each of these columns in more 
detail. 
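
The time-based view of workload can be sketched as a simple ratio. In the example below, the function names and the checklist durations are hypothetical; the only idea taken from the text is that workload is the ratio of time required to time available, with a breakdown expected when that ratio exceeds 1.

```python
def time_pressure(time_required_s, time_available_s):
    """Workload as the ratio of time required to time available."""
    if time_available_s <= 0:
        raise ValueError("time available must be positive")
    return time_required_s / time_available_s


def is_overloaded(time_required_s, time_available_s):
    """A breakdown is predicted when the time required exceeds the time available."""
    return time_pressure(time_required_s, time_available_s) > 1.0


# Hypothetical checklist: 90 s of work, 60 s before the next checklist begins
rushed = time_pressure(90, 60)
# Hypothetical checklist: 45 s of work in the same 60 s window
relaxed = time_pressure(45, 60)
```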

Single Resources: Effort 

Single resource models, where resources are based on the concept of effort, may, 
on the one hand, simply have as inputs pilot or subject matter expert (SME) 
assigned values of the effort of certain tasks (for example, a Subjective Workload 
Assessment Technique (SWAT) or a NASA Task Load Index (NASA-TLX) 
rating). However, these become somewhat circular to the extent that model 
validation is itself based upon a subjective workload assessment measure. More 
valuable, but more complex, are resource models in which predicted workload is 
computed on the basis of some objective computational algorithm of cognitive, 
perceptual or motor complexity (for example, Boag, Neal, Loft & Halford, 2006; 
Manton & Hughes, 1990 [9]) directly related to mental workload. For example, in 
Sebok et al. (2012 [18]), a model of FMS complexity is derived, based on the number 
of elements (modes) and the interrelationships between them. 

As examples of complexity models, Parks and Boucek (1989 [11]) propose a 
model of aviation display cognitive complexity, based upon the formal information 
content of the display; that is, the number of possible states that a given display 
can assume; where two states represent one bit of information, and four states 
represent two bits. Quite promising is a model of relational complexity developed 
by Halford, Wilson and Phillips (1998) which has been applied to ATC conflict and 
traffic understanding (Boag et al., 2006); but not yet applied to pilot operations. A 
“relation” here is defined when the level of one variable is dependent on the level 
of another. For example, the parameters of one mode setting in cockpit automation 
might be dependent on the level of another (for example, a heading mode depends 
on whether an aircraft is ascending or descending). Two variables interacting in 
this fashion would define a relational complexity of two. 
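
The information-content measure of display complexity mentioned above can be sketched as follows, under the assumption that all display states are equally likely, so that the information in bits is the base-2 logarithm of the number of states. The function name is hypothetical, and this is only an illustration of the bit calculation, not the procedure used by Parks and Boucek.

```python
import math


def display_information_bits(n_states):
    """Information content (in bits) of a display with n equally likely states."""
    if n_states < 1:
        raise ValueError("a display must have at least one state")
    return math.log2(n_states)


two_state_bits = display_information_bits(2)   # one bit
four_state_bits = display_information_bits(4)  # two bits
```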

Single Resources: Time 

When time is considered the single resource for which all tasks compete, and tasks 
must be performed sequentially, modeling efforts have varied in the complexity of 
the treatment of predicted time allocation between tasks (see center column, Table 
5.1). At the simplest level, Parks and Boucek [11] have developed an aviation 
time-line-analysis procedure (TLAP) that simply tallies the total time demand of 
all tasks within an interval and divides this total by interval length to compute 
a predicted workload level. Such a technique can be made a bit more complex 
(and accurate) to the extent that amplified penalties are assigned proportional to 
the time that two tasks must make demands for the same (overlapping) period of 
time, such as when a pilot must lower a landing gear while attending to an ATC 
communication, and cannot postpone one or the other. 
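
A time-line analysis of this kind can be sketched as below. The function tallies the time demanded by all tasks within an interval, optionally adds a penalty proportional to the time during which two tasks overlap, and divides by the interval length. The task timings, the penalty weight, and the function name are hypothetical; this is an illustration of the idea rather than the TLAP procedure itself.

```python
def tlap_workload(tasks, interval_start, interval_end, overlap_penalty=0.0):
    """Time-line workload: (task time + weighted overlap time) / interval length.

    `tasks` is a list of (start, end) times in seconds.
    """
    length = interval_end - interval_start
    if length <= 0:
        raise ValueError("interval must have positive length")

    # Keep only the part of each task that falls inside the interval
    clipped = []
    for start, end in tasks:
        s, e = max(start, interval_start), min(end, interval_end)
        if e > s:
            clipped.append((s, e))

    demand = sum(e - s for s, e in clipped)

    # Total time during which each pair of tasks is active simultaneously
    overlap = 0.0
    for i in range(len(clipped)):
        for j in range(i + 1, len(clipped)):
            s = max(clipped[i][0], clipped[j][0])
            e = min(clipped[i][1], clipped[j][1])
            overlap += max(0.0, e - s)

    return (demand + overlap_penalty * overlap) / length


# Hypothetical 60 s interval: lower gear (10-20 s), ATC message (15-35 s),
# checklist (40-55 s); gear and ATC overlap for 5 s
tasks = [(10, 20), (15, 35), (40, 55)]
basic = tlap_workload(tasks, 0, 60)
penalized = tlap_workload(tasks, 0, 60, overlap_penalty=1.0)
```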

A layer of complexity is then added to the extent that time models depend 
on formal queuing theory for their scheduling of tasks [for example, 21], and 
penalties are assigned for tasks that must wait until they are “served” by the single 
server queue (the pilot). Finally, at what we consider the highest level of time- 
modeled complexity, are those models that depend upon a formal well-validated 
(often in non-aviation domains) architecture, such as ACT-R, or Goals, Operators, 
Methods, and Selection rules (GOMS) techniques, or a derivative thereof (Gil & Kaber, 
2012 [1]). Such models are often applied in aviation to predict other aspects than 
workload, such as procedural errors (see Wickens, Sebok et al., 2013). These 
models typically include some time-management routine that will dictate the 
sequence in which tasks are performed, and the period in which a task may be 
“neglected.” Such periods of task neglect provide an implicit measure of multiple 
task performance breakdown. 
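
The queuing view, in which the pilot is a single server and tasks wait until they can be served, can be illustrated with the sketch below. It assumes first-come-first-served scheduling, which is a simplification chosen for illustration; the task names and timings are hypothetical. The waiting time of each task plays the role of the "neglect" period described above.

```python
def single_server_waits(tasks):
    """Waiting time of each task when one operator serves tasks in arrival order.

    `tasks` is a list of (name, arrival_time, service_time) in seconds.
    """
    waits = {}
    server_free_at = 0.0
    for name, arrival, service in sorted(tasks, key=lambda t: t[1]):
        start = max(arrival, server_free_at)  # wait if the operator is busy
        waits[name] = start - arrival
        server_free_at = start + service
    return waits


# Hypothetical task stream for one pilot
waits = single_server_waits([
    ("atc_readback", 0, 8),
    ("gear_down", 3, 4),
    ("checklist_item", 5, 6),
])
total_wait = sum(waits.values())
```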

Multiple Resources 

While single resource models may be extended in complexity either in the 
direction of effort or time, the third dimension of complexity is to consider that 
resources are multifaceted. As depicted in the right column, the pilot’s information 
processing system contains separate resources such as visual versus auditory 
perception or vocal versus manual responses. To the extent that two tasks rely 
upon separate resources, they will be more successfully time shared (although 
this will not necessarily lower workload; Wickens, 2008). Hence multiple resource 
theory applies more to multitasking than to workload prediction. Thus the pilot 
will be more likely to hear an auditory warning signal if she is simultaneously 
reading a datalink message (visual) than if she is hearing an auditory ATC communication. 
Multiple resource models are typically used to predict multiple task performance 
decrements (for example, Sarno & Wickens, 1995 [15]), but have sometimes been 
used to predict subjective measures (See & Vidulich, 1998 [19]) or both (Wickens 
et al., 1988, 1989 [22, 23]). 

The identity of these multiple resources is typically based on a fundamental 
model developed by Wickens (1984) that defines resources along four categorical 
dimensions (see Wickens, Hollands, Banbury & Parasuraman, 2013), each 
dimension having two (or in one case, three) categorical levels: (1) perceptual/ 
cognitive versus response stages; (2) verbal linguistic versus spatial processes; 
(3) visual versus auditory versus tactile perception; (4) focal vision (for example, 
reading text and symbols) versus ambient vision (for example, processing motion 
and flow fields). Because these separate resources were originally assigned 
to four categories by early model developers (visual, auditory, cognitive and 
psychomotor; Aldrich, Szabo & Bierbaum, 1989), the multiple resource approach 
is often referred to as the “VACP approach.” However, the actual complexity of 
what defines resources is often increased beyond four categories and may vary 
slightly between different applications (see text below). 

Fundamental to all such multiple resource models is that tasks are assigned by a 
SME, by the modeling analyst, or by “table lookup” to one or more resource types 
or resource “channels” (for example, within the VACP model, comprehending 
ATC instructions will be considered an auditory-cognitive [“A-C”] task; manual 
flight control will be [V-M]; and monitoring an autopilot-guided approach will be 
[V]). Then within each resource, a demand level is selected, often on a 1-6 scale. 
For example, in vision, detect = 1.0, track/follow = 4.4. These, and other demand 
values, may be found in Laughery et al. (2012). 

Some models stop at this point, calculating demands along individual channels. 
Other models create a simple workload prediction by summing these values 
across channels. At an even greater level of complexity, some models examine 
the extent to which any single channel is “overloaded” (for example, cognitive 
value >5) by the sum of all task demands within that resource. Such overload 
predicts a multiple task performance breakdown, because if any single resource 
is overloaded, performance will falter: “the chain is only as strong as its weakest 
link” [4]. Finally, most faithful to the original multiple resource models (see Sarno 
& Wickens, 1995 [15]) are those that assign weighted penalties to task pairs to the 
extent that they share more common levels on each dimension within the multiple 
resource space (Wickens, 2008). Thus, for example, two linguistic perceptual tasks 
(for example, reading instructions and listening to ATC) will create more conflict 
(and thus a higher predicted multitasking penalty) than a spatial and a linguistic 
perceptual task (for example, scanning for traffic and listening to ATC) even as 
both task pairings predict some decrement because they both demand access to 
perception. These penalties are calculated by summing the conflict coefficients for 
each pair of resources jointly demanded by the two tasks. Values of such conflict 
coefficients are available in more detailed renderings of the model (for example, 
Wickens, 2005). 
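
The first three levels of the VACP approach (table lookup, summing across channels, and per-channel overload) can be sketched as follows. Only the visual values for "detect" (1.0) and "track/follow" (4.4) come from the text; every other demand value, the activity labels, the task-to-channel assignments beyond those named in the text, and the function names are hypothetical placeholders.

```python
# Demand lookup table keyed by (channel, activity)
DEMAND_TABLE = {
    ("visual", "detect"): 1.0,                 # from the text
    ("visual", "track/follow"): 4.4,           # from the text
    ("auditory", "interpret speech"): 3.0,     # hypothetical value
    ("cognitive", "evaluate"): 4.0,            # hypothetical value
    ("psychomotor", "continuous adjust"): 2.5, # hypothetical value
}

# Channel assignments per task (illustrative)
TASKS = {
    "comprehend_atc": [("auditory", "interpret speech"), ("cognitive", "evaluate")],
    "manual_flight_control": [("visual", "track/follow"),
                              ("psychomotor", "continuous adjust")],
    "scan_for_traffic": [("visual", "detect")],
}


def channel_demands(active_tasks):
    """Sum the looked-up demand of all concurrent tasks within each channel."""
    totals = {}
    for task in active_tasks:
        for channel, activity in TASKS[task]:
            totals[channel] = totals.get(channel, 0.0) + DEMAND_TABLE[(channel, activity)]
    return totals


def summed_workload(totals):
    """Simple workload prediction: total demand across all channels."""
    return sum(totals.values())


def overloaded_channels(totals, threshold=5.0):
    """Channels whose summed demand exceeds the overload threshold."""
    return sorted(ch for ch, value in totals.items() if value > threshold)


totals = channel_demands(["comprehend_atc", "manual_flight_control", "scan_for_traffic"])
overall = summed_workload(totals)
overloaded = overloaded_channels(totals)
```

In this made-up case only the visual channel exceeds the threshold, which illustrates the "weakest link" logic: a breakdown is predicted even though the other channels remain within limits.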

Finally, at the highest level of complexity (Riley et al., 1991 [14]; Sarno & 
Wickens, 1995 [15]) multiple resource models may compute decrements by 
weighting the conflict by another factor which is the demand of each task within its 
channel. Thus, for example, the conflict coefficient between a visual and auditory 
perceptual task would be elevated if either or both of these were more demanding. 
However, both Riley et al. (1991 [14]) and our own research (Wickens & Tsang, 
2014) concur that this highest level of complexity is not necessary; and only serves 
to decrease the stability of the conflict values. Workload and interference models 
do just as good a job in prediction if conflict coefficients are stable, across tasks 
of all different difficulties, and a separate demand component then sums the total 
task demands (for both tasks) across all different resource types. These two 
meta-components can then be equally weighted and summed, to derive a total predicted 
workload measure. 
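
The two-component calculation described above can be sketched as follows: a conflict component that sums stable conflict coefficients over every pair of resources jointly demanded by two tasks, plus a demand component that sums task demands across resources, with the two added at equal weight. The conflict coefficients, resource labels, and demand values are hypothetical placeholders rather than published values.

```python
# Hypothetical conflict coefficients for pairs of perceptual resources
# (0 = no conflict, 1 = complete conflict); unlisted pairs default to 0
CONFLICT = {
    frozenset({"auditory_verbal"}): 0.8,                    # same resource
    frozenset({"visual_verbal", "auditory_verbal"}): 0.6,   # both linguistic
    frozenset({"visual_spatial", "auditory_verbal"}): 0.4,  # spatial vs linguistic
}


def conflict_component(task_a, task_b, table=CONFLICT):
    """Sum of conflict coefficients over all resource pairs used by the two tasks."""
    return sum(table.get(frozenset((ra, rb)), 0.0)
               for ra in task_a for rb in task_b)


def demand_component(task_a, task_b):
    """Total demand of both tasks across all resource types."""
    return sum(task_a.values()) + sum(task_b.values())


def predicted_interference(task_a, task_b):
    """Equal-weight sum of the demand and conflict components."""
    return demand_component(task_a, task_b) + conflict_component(task_a, task_b)


# Each task maps the resources it uses to a hypothetical demand level
listen_to_atc = {"auditory_verbal": 1.0}
read_instructions = {"visual_verbal": 1.0}
scan_for_traffic = {"visual_spatial": 1.0}

reading_while_listening = predicted_interference(read_instructions, listen_to_atc)
scanning_while_listening = predicted_interference(scan_for_traffic, listen_to_atc)
```

With these placeholder coefficients, the two linguistic tasks receive the larger predicted penalty, while both pairings show some penalty because both demand perception.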

An example that illustrates both the different levels of model complexity, and 
the process of validation is provided by Sarno & Wickens (1995 [15]). Participants 
in their study performed a simulated ILS (instrument landing system) tracking 
approach concurrently with a simulated peripheral visual monitoring task, and 
either of two aviation decision tasks; one involving spatial localization of a 
combat aircraft, and the other a verbal fuel computational problem. Each of these 
tasks was presented with either a visual or an auditory display, was responded to by either 
voice or hand, and could be either easy or difficult in its cognitive aspects. 
Hence altogether 16 different data points of task interference between tracking 
and decision making could be derived. 

Three computational models of increasing complexity were implemented, 
based upon task time line analysis, workload analysis using the McCracken 
and Aldrich VACP scale values, and upon analyses of tasks by their resource 
composition. One model was based on pure time line analysis; and predicted 
penalties to the extent that tasks overlapped in time. The second was demand 
based, and predicted penalties based on the sum of resource demand scores across 
channels of concurrently performed tasks. The third invoked the most complex 
form of weightings for channel conflicts, as described above. All model predictions 
were correlated against obtained measures of interference between the decision 
and the tracking task; and these correlations were reasonably high, in the range of 
0.75 to 0.87. The three models were then compared in terms of different features 
that contributed to their prediction, and the authors concluded that including 
separate demand levels for tasks along resource channels contributed little to 
the predictive ability, but including the weighted conflict matrix component 
contributed an added 9 percent of the performance variance accounted for by the 
model predictions. Similar conclusions were drawn when the three models were 
applied to an independent set of data from Wickens et al. (1988, 1989 [22, 23]). 
Importantly, the latter two studies found models incorporating multiple resource 
conflict were better predictors of task interference, while those incorporating only 
demand were better predictors of subjective workload. 


Discussion and Conclusions: Validity, Validation, and Complexity 

Tallying across the two “validation” codes assigned in Table 5.1, we reach the 
conclusion that 15 of the 23 model efforts have been validated; but of those 15, 
only four appear to be carried out in what might be described as a high fidelity 
context. Correspondingly, of the 14 high fidelity studies, only four report careful 
validation. The reader should be aware that we have defined these two criteria 
liberally (for example, including validation efforts in which correlations, of the 
kind shown in Figure 5.1, were not computed). If we were to restrict “high fidelity” to include 
only true NextGen concepts, and “validation” to require use of correlations, this 
number would be reduced substantially. 

We also note the scarcity of high fidelity validation toward the top of the 
table, where complex models predominate. Indeed, two of these four address only 
cognitive complexity of component aspects of the flight deck. We account 
for this state of affairs in more detail in Wickens, Sebok et al. (2013), but we 
are well aware of the difficulty of accomplishing full validations of complex 
models, particularly with the limited financial resources often made available 
for modeling efforts. However, in this regard, we note and emphasize that not 
every modeling effort needs to be validated. Once a model architecture is 
validated in one context, empirical support has been provided to suggest that 
its un-validated predictions will nevertheless be accurate in a different context. 
This allows for reducing the time required to assess the viability of NextGen 
technology and procedures through model-based efforts. Another finding of this 
research is that a variety of software and modeling architectures are available 
for developing models of pilot workload. Workload can be—and has been— 
modeled as task loading, subjective effort, time demands, operator resource 
demands, and conflicts among operator resource demands. While additional 
validation efforts are warranted, these modeling techniques provide useful ways 
of comparing different flight deck designs and different concepts of operation 
for use in NextGen airspace. 


Chapter 6 

Clarifying Cognitive Complexity and 
Controller Strategies in Disturbed Inbound 
Peak Air Traffic Control Operations 

Marian J. Schuver-van Blanken 
Air Traffic Control the Netherlands, The Netherlands 


Within a wide range of operational situations, ATC Controllers are able to ensure safety 
while keeping efficiency and environmental impact in optimal balance. Especially 
in managing operational disturbances and unpredictable events, characterized by 
high complexity and dynamics, ATC Controller expertise is crucial (Redding, Ryder, 
Seamster, Purcell & Cannon, 1991; Schuver-van Blanken, Huisman & Roerdink, 
2010). The high level of required ATC Controller expertise is primarily caused by the 
cognitive complexity that is inherent in managing disturbed ATC operation. Further, 
ATC Controller expertise is characterized by strategies to mitigate this cognitive 
complexity and to ensure safety and efficiency in day-to-day operational disturbances. 

Knowing which factors contribute to cognitive complexity in disturbed 
operation and which strategies underlie controller expertise can provide the 
key both to reducing cognitive complexity in ATC system design and to 
improving training for acquiring ATC Controller expertise. This chapter describes 
results of research into cognitive complexity and ATC Controller strategies in the 
operational field of disturbed inbound peak operation at ATC the Netherlands. 
The chapter starts with the ATC Controller Cognitive Process and Operational 
Situation (ACoPOS) model that was used as a framework for the research. Using 
the ACoPOS model, the factors contributing to cognitive complexity are clarified. 
Next, the strategies used by expert ATC Controllers are described. 


The Air Traffic Controller Cognitive Process & Operational Situation 
Model (ACoPOS) 

The cognitive nature of the ATC task, performed in a dynamic environment, 
makes controlling an air traffic situation difficult; this difficulty is referred to as 
cognitive complexity (Histon & Hansman, 2008). To systematically analyze and 
clarify the factors determining cognitive complexity in ATC, the ACoPOS model 
was developed at ATC the Netherlands (LVNL) (see Figure 6.1) (Schuver-van 
Blanken et al., 2010). 

The ACoPOS model was developed based on a literature review on ATC 
complexity and the models of Endsley (1995), Histon and Hansman (2008) and Oprins, 
Burggraaff & van Weerdenburg (2006). Distinguished in the ACoPOS model are 
cognitive processes (right-hand side of the model) and the operational situation 
(left-hand side of the model). The model provides insight into the relation between 
factors in the operational ATC situation and their impact on ATC Controller 
cognitive processes. By doing this, cognitive complexity issues experienced by 
an ATC Controller can be pinpointed and clarified within the context of a given 
operational situation in which the tasks are performed. The model will be further 
explained in the following sections. 

Figure 6.1 The ATC Controller Cognitive Process & Operational 
Situation model (ACoPOS model) 

Source: Adapted from Schuver-van Blanken, Huisman & Roerdink, 2010 

Operational air traffic control (ATC) situation 

The elements in the operational situation result (either temporarily or permanently) 
in a particular cognitive complexity. Based on studies on ATC complexity factors 
(Hilburn, 2004; Histon & Hansman, 2008; Mogford, Guttman, Morrow & 
Kopardekar, 1995), the following elements can be distinguished: 

Safety, efficiency and environment 

In ATC, requirements in terms of safety, efficiency and environment have to 
be met and continuously balanced. While safety is kept as the highest priority, the 
importance of efficiency is increasing as more aircraft have to be handled in less 
time and with more punctuality. In addition, stringent environmental constraints 
apply, especially for noise abatement purposes. 

Strategic traffic situation 

The configuration of the strategic traffic situation and the operational changes in 
the strategic situation set the physical framework within which traffic has to be 
handled. Complexity in the strategic traffic situation originates from the airspace 
route structure, the sector geometry, the airport layout and runway configuration, 
the amount of traffic and traffic peaks as well as the accuracy of flight planning 
and flow management. 

Tactical traffic situation 

The tactical traffic situation comprises the actual traffic situation and is characterized 
by the dynamic nature of the situation, continuous changes in the information and 
factors interacting with each other. Since this makes the tactical situation hard to 
pre-specify, it can result in unpredictable or unexpected situations, which can be 
cognitively demanding. Next to the actual traffic positions resulting in possible 
conflicts, complexity is determined by the interaction between inbound, outbound 
and crossing traffic flows, the performance characteristics of aircraft, the diversity 
in traffic, the severity of weather conditions, and the presence of emergencies or 
exceptional situations. 

Team 

The amount of information sharing, interaction, and coordination with other team 
members (being other ATC Controllers, supervisors, assistants, adjacent centers, 
pilots and airport authorities) comprise an important part of the ATC task and can 
increase or decrease cognitive complexity. In addition, ATC Controllers work in 
different team configurations, due to the fact that working positions are combined 
and decombined, depending on the traffic situation and operation. 

Procedures 

Procedures comprise standard operating procedures and rules for traffic handling 
as well as working methods. Cognitive complexity is influenced by the number of 
procedures, the complexity of procedures, possible ambiguity in procedures and 
diversity in working methods. 

Technical systems 

The design of systems used for communication, navigation, surveillance, and 
decision support can heavily impact complexity: a mismatch between system 
characteristics and human information processing can increase cognitive 
complexity, for instance because the cognitive process cannot be performed 
intuitively. 

Cognitive processes in air traffic control (ATC) 

Based on existing ATC cognitive models (Endsley, 1995; Histon & Hansman, 2008; 
Oprins et al., 2006), three main categories of cognitive processes are included in 
the ACoPOS model, which are translated into actions. 

Situation assessment 

Situation assessment is a cognitively demanding process due to the dynamic and 
interactive nature of the elements in the operational situation. Situation assessment 
involves perceiving the information, interpreting the actual situation into a 
complete picture, and anticipating or projecting how the situation will evolve. 

Problem solving and decision making 

Problem solving and decision making concerns the process that ATC Controllers 
use to handle air traffic safely and efficiently within environmental constraints 
according to procedures. In a complex and dynamic environment such as ATC, 
ATC Controllers continuously solve (potential) problem- or conflict-situations, 
formulate and adjust plans for traffic handling and decide on what course of action 
to take. 

Attention management and workload management 

Attention and workload management ensure that the ATC Controller’s limited 
attention capacity and working memory capacity are optimally employed and 
adapted to the needs in the specific situation. This includes setting priorities, 
directing attention to specific situations and information, while also dividing 
attention systematically to maintain an overview of the situation. 

Actions 

The outcome of the cognitive processes results in actions executed by the ATC 
Controller to interact with the operational environment and implement plans and 
solutions. These actions include communication (including radio/telephony (RT)), 
coordination, teamwork, and the use of operational systems. 


Clarifying Cognitive Complexity in Disturbed Air Traffic Control 
(ATC) Operation 

Day-to-day ATC is frequently characterized by disturbances as a result of 
dynamic factors in the situation, unpredictable events, or complex situations. The 
ATC Controller’s proficiency lies especially in effectively managing operational 
disturbances and unpredictable events and fluctuations (for example, Redding et 
al., 1991; Schuver-van Blanken et al., 2010). Disturbed operation can be seen 
on a continuum between standard and exceptional operation. Standard operation 
comprises routine or standard traffic handling, whilst exceptional operation 
comprises extreme (non-routine) situations in which safety is maintained but 
efficiency of traffic handling is affected. Disturbed operation requires adapted 
traffic handling, in which safety is ensured and efficiency is maintained at an 
optimal level. In disturbed operation, traffic handling and working methods 
have to be (temporarily) adapted for a shorter or longer period to mitigate the 
disturbance (Schuver-van Blanken & van Merrienboer, 2012). It is especially 
in disturbed operation that ATC Controllers report experiencing increased 
cognitive complexity. 

Handling day-to-day disturbances in inbound peak operation at ATC 
the Netherlands 

A prototypical example of handling day-to-day disturbances at ATC the Netherlands 
is inbound peak operation for Schiphol Airport. In the current inbound operation at 
Schiphol Airport, Amsterdam, inbound traffic is initially controlled by Amsterdam 
Area Control Centre controllers (AMS ACC), who deliver inbound traffic at the 
initial approach fixes (IAFs). From the IAFs, tactical vectoring is applied by 
the Schiphol Approach controllers (SPL APP) to guide traffic to the runways in 
the Terminal Maneuvering Area (TMA). In handling air traffic in inbound peak 
operation, traffic flows need to be merged and sequenced for the runways in use. 
Prototypical inbound peak operation is characterized by day-to-day operational 
disturbances and unpredictable events that have to be handled. 

Figure 6.2 provides the factors constituting the operational situation and 
cognitive complexity issues experienced by radar controllers at ATC the 
Netherlands during inbound peak operation. The cognitive complexity issues were 
identified from human factors analyses, the bottlenecks 
found in acquiring ATC Controller expertise in ATC training at LVNL (Oprins, 
2008), and interviews and group discussions with experienced Area and 
Approach radar controllers at LVNL. Combining these insights in the ACoPOS 
model, the operational factors and cognitive complexity issues in handling day-to- 
day disturbances in inbound peak operation have been identified. 

Strategic traffic situation 

For inbound peak operations a maximum of two inbound (landing) and one 
outbound (takeoff) runway are used. Further, the available airspace is relatively 
small and traffic volumes in inbound peak are high and dense. The number of 
converging runways, the traffic volume in inbound peak operation, as well 
as the runway configuration and changes in operating mode due to weather or 
environmental conditions lead to an inherent complexity in handling air traffic. 
Planned or unplanned adaptations or changes in the strategic traffic situation 
occur frequently during inbound peak operations, causing an increase in cognitive 
complexity. A switch in operating mode during peak operations requires adaptation 
or rebuilding of inbound and outbound traffic flows or rebalancing of traffic over 
the available runways to accommodate the switch. In addition, traffic might also 
have to hold in holding areas (stacks). 

Figure 6.2 Key factors determining cognitive complexity in day-to-day disturbances in the inbound peak at ATC 

Tactical traffic situation 

Typically, inbound peak is characterized by bunches of aircraft arriving at the same 
time. In addition, area control as well as approach control not only handle inbound 
and outbound aircraft to or from Schiphol airport, but also crossing traffic with 
a destination in an adjacent center or sector or traffic from/to regional airports. 
Crossing and regional traffic interfere with inbound and outbound traffic flows, 
causing complex conflict situations with limited maneuvering space and a need 
for deviations from standard arrival routings. Further, variations in aircraft (a/c) 
performance, especially slow-climbing (outbound) traffic, impact inbound traffic 
flows such that standard traffic routings (temporarily) cannot be used. Because 
of traffic density and available capacity, traffic delays may occur, which need to 
be accommodated through traffic flow routings, speed control, or holding, as time-based 
operation applies in inbound peak operations. In addition, the wind and weather 
situation (such as heavy rain, wind, reduced visibility) impacts traffic handling and 
traffic flows and can result in traffic delays. Pilot requests that need to be 
accommodated also increase during disturbances due to weather and wind. 

Teamwork and interaction 

Especially in disturbed peak operations, the importance of teamwork and interaction 
with other team members (including planner and supervisor) within the ATC 
center as well as outside the center, including pilots, becomes critical to success. 
As a result of workload or increasing complexity, executive functions may be split 
into multiple working positions. For example, when prolonged holding situations 
occur, a separate stack controller is assigned. Additional factors in teamwork and 
interaction also comprise when (the timing) and how (the setting) traffic is handed 
over from the adjacent sector (for example, late transfer of aircraft, non-standard 
transfer of aircraft) and the timing of pilot actions and adequacy of response (for 
example, late reaction, aircraft turns later than expected, aircraft descends slower 
than expected). Within the team, the planner controller can pro-actively mitigate 
potential problem situations by influencing or re-planning traffic while the aircraft 
is still in the previous sector. Moreover, the supervisor decides, among other things, on 
the operating mode and when and how to combine or split sectors and working 
positions. This also influences complexity. 

Procedures 

Different runway configurations (parallel, converging, strong wind) lead to 
different procedures that apply. Moreover, procedures vary with the time of the 
day, for example, the morning inbound peak coincides with the shift between 
night and day operation (determined by clocktime) and, dependent on the time 
of the year, the beginning of the daylight period. Also, different procedures exist 
for these variations. For example, runway use is more restricted during the night 
relative to daylight conditions. For switching between operating modes or for 
handling disturbances and unexpected situations, no standard procedures are 
possible. Thus, safety and efficiency depend on the ATC Controller’s proficiency 
in adapting to the specific contingencies. 

Technical systems 

In inbound peak operation, large amounts of information, updates, and alerts are 
available for the controller. In addition, this information is distributed among 
a diversity of visual and auditory information sources (for example, label 
information, radar screen, planning lists, weather information systems, radio/ 
telephony communication (R/T) and oral communication with ATC Controllers 
in the team). Limited advanced decision support tools are available in current 
systems to combine the information or to provide insight into the situation. In 
inbound peak operation, R/T frequencies may become congested because of the 
number of messages at high traffic volume. 

Cognitive complexity issues in day-to-day disturbances in inbound 
peak operation 

The following complexity issues can be identified related to handling day-to-day 
disturbances in inbound peak situation: 

Situation assessment 

Inbound peak operation, disturbing events, and operational fluctuations result in a 
high amount of change in the operational situation (tactical, strategic, and team). The 
complexity of situation assessment is created by frequent and continuous changes 
and updates in the information that is perceived or needs to be found. In addition, the 
combination of information from the different sources to construct a mental picture 
of the current situation is effortful. Further, the complexity of situation assessment 
is determined by the anticipation of emerging deviations between the actual and 
planned situation and assessing uncertainties about how the situation could unfold. 
These complexity issues lead to the continuous adaptation and rebuilding of the 
mental picture, which demands cognitive resources due to the number of variables 
that need to be kept up to date in working memory. In addition, it is complex due to 
the number of variables, deviations, and uncertainties that need to be combined to 
construct a meaningful mental picture with relevant information. This is referred to 
as a “working mental model” (Histon & Hansman, 2008), which is needed to create 
an overview of the emerging situation and to stay in control. 

Attention management and workload management 

The need for systematic scanning of the operational situation without being 
distracted by events requires the ability to focus on specific situations, but also 
to divide attention among multiple situations. Intensive scanning is needed, due to 
the number of changes, high traffic volume and density. In addition, variations in 
workload occur due to disturbing events or operational fluctuations, which require 
the controllers to accelerate their task performance to stay in control. 

Problem solving and decision making 

In a typical inbound peak, the de-confliction of bunches of traffic creates cognitive 
complexity. As traffic density and traffic volume are high within dense airspace, limited 
solution space is available for the ATC Controller to solve conflicts and reroute and 
maneuver aircraft. Therefore, complex problem solving is needed to be able to find 
immediate solutions, to route aircraft, and to create space. Standard solutions and 
routing (for example, speed and altitudes used) are applied, but switching to non-routine 
traffic handling is often required, increasing cognitive complexity. In doing 
this, the ATC Controller has to adapt the traffic handling plan to the new situation and 
has to continuously create alternative plans, to avoid the risk of working reactively 
and eventually losing control of the situation. In addition, the ATC Controller has to 
switch between routine tasks and non-routine (more effortful) problem solving. ATC 
is characterized by timely decision making, which is time-critical in inbound peak 
operation and requires prioritizing of tasks and actions. 

Actions 

A busy inbound peak situation is typically characterized by a high RT load and a 
high number of system input actions. In addition, intensive information sharing 
and teaming between team members within and outside the sector occurs to pro-actively 
stay in control. This also increases coordination, especially for 
non-standard traffic handling situations. 


Air Traffic Controller Strategies in Disturbed Air Traffic Control (ATC) 
Inbound Peak Operation 

To cope with cognitive complexity in disturbed operation, controllers continuously 
use strategies to adapt their task performance in response to the characteristics 
and dynamics of the operational situation and constraints. Strategies reduce the 
likelihood of overall task performance being compromised (Histon & Hansman, 
2008; Loft, Sanderson, Neal & Mooij, 2007; Malakis, Kontogiannis & 
Kirwan, 2010; Mogford et al., 1995; Nunes & Mogford, 2003). 

Air Traffic Controller (ATC) strategies—overview from literature and research 

Strategies involved in ATC available from literature and research can be categorized 
into the cognitive processes involved in ATC as distinguished in the ACoPOS 
model as is displayed in Table 6.1 (Schuver-van Blanken & van Merrienboer, 
2012). 

Table 6.1 List of ATC Controller strategies derived from literature, 
categorized into ATC Controller cognitive processes 

Bracketed numbers refer to the numbered references listed below the table. 

Situation Assessment—Perception strategies 

• Selectively extract a/c data for conflict detection (altitude 1st, heading 2nd, speed 3rd) [1, 2, 3, 4] 

Situation Assessment—Interpretation strategies 

• Grouping aircraft into categories or groups (for example, standard/non-standard, same characteristics) [6, 12] 
• Grouping aircraft into standard or non-standard streams or flows of traffic [1, 6, 7, 8, 10, 11] 
• Identifying critical points or hotspots (where conflicts typically occur or at intersection of routes) [1, 6, 8] 

Situation Assessment—Anticipation strategies 

• Thinking ahead of possible threats (for example, meteorological conditions, congested airspace, aircraft malfunctions) to manage uncertainty [5] 
• Play out mentally the progression of events [5] 

Attention and Workload Management Strategies 

• Increasing attention for aircraft depending on control action or situation [2, 10, 12] 
• Switch attention between tasks on the basis of available time and importance [5, 9, 10, 11] 
• Saving attentional resources to keep workload at an acceptable level (for example, avoiding monitoring) [9, 14] 

Problem-solving Strategies 

• Refer to previously used solutions [8] 
• Speed separation (within streams) versus altitude control for aircraft separation [12] 
• Narrowing or dividing a problem into smaller parts [14] 
• Partially solving a problem and fine tune later [14] 
• Formulating simple solutions (for example, few actions, less coordination) [14] 
• Increasing safety buffers to manage uncertainty or become more cautious [9, 10, 13] 
• Preventing potential problems/mitigate consequences before they result in a problem situation [5] 

Planning Strategies 

• Formulating a back-up plan/alternative plan in case initial plan does not work [5, 9] 
• Reverting to standard routings or a routinized working method to ensure safety instead of efficiency [9, 10, 11] 

Decision-making Strategies 

• Being selective in when to intervene, depending on probability or risk (wait and see versus acting immediately) [1, 9] 

References: 

1. Amaldi & Leroux (1995) 
2. Bisseret (1971) 
3. Nunes & Mogford (2003) 
4. Rantanen & Nunes (2009) 
5. Malakis et al. (2010) 
6. Histon & Hansman (2008) 
7. Redding et al. (1991) 
8. Seamster, Redding, Cannon, Ryder & Purcell (1993) 
9. D'Arcy & Della Rocco (2001) 
10. Sperandio (1971) 
11. Sperandio (1978) 
12. Gronlund, Ohrt, Dougherty, Perry & Manning (1998) 
13. Loft, Humphreys & Neal (2003) 
14. Flynn & Kauppinen (2002) 

Source: Schuver-van Blanken & van Merrienboer, 2012 

Controller strategies in disturbed inbound peak operation—exploratory study 

Available research on controller strategies has predominantly focused 
on strategies in standard situations, part-task experimental settings or situations of 
high or low traffic volume or density. The question remains which strategies 
are used by controllers in response to disturbed operational situations and whether 
other strategies exist in addition to those found in literature. Therefore, we started 
an exploratory study focusing on the research question: Which strategies do 
radar controllers use in response to disturbed inbound peak ATC operation? The 
focus of the study is on radar control, both area and approach control at ATC 
the Netherlands in dense, busy airspace around Amsterdam Airport Schiphol. 
The elements of the exploratory study are described below (see also Schuver-van 
Blanken & van Merrienboer, 2012; Schuver-van Blanken & Roerdink, 2013). 

Retrospective interviews 

To reveal the strategies used in disturbed operational situations, the method of 
semi-structured retrospective interviews with expert ATC Controllers based 
on the critical decision method (CDM) (Klein, Calderwood & MacGregor, 
1989) was applied. CDM is a retrospective semi-structured interview technique 
that uses probing questions to elicit thinking processes. A replay of the traffic 
handling in the disturbed situation based on a radar data recording system at ATC 
the Netherlands was used as a basis of the retrospective interview. Three expert 
radar controllers were interviewed (average operational experience 20 years 
as ATC Controller) on the disturbed operational inbound peak situation they 
handled themselves during their operational duty (see Schuver-van Blanken & van 
Merrienboer, 2012). The retrospective interviews revealed 12 events comprising 
disturbances and unpredictable events in inbound peak operation (comprising 
complex conflict situations, bunches of aircraft, traffic delays, blocking RT, 
non-standard aircraft, regional aircraft, crossing traffic, limiting maneuvering or 
solution space, and heavy weather circumstances). 

Focus group 

A focus group is defined as a “carefully planned series of discussions designed to 
obtain perceptions on a defined area of interest in a permissive, non-threatening 
environment” (Krueger and Casey, 2000, p. 5). The ACoPOS model formed the 
basis for the structure of the focus group. Eleven area control radar controllers at 
LVNL participated, with an average operational experience of 17 years. The focus 
group duration was 1.5 hours. A common mindset was created on typical inbound 
peak operation in the morning at a dense area control sector, using four short movies 
with typical examples of operational traffic handling. ACoPOS was used as a basis 
for a shared understanding of the (disturbing) factors present in typical inbound 
peak operation as well as to systematically address the cognitive processes involved. 
Expert insights were generated by means of probing questions to get indications 
for strategies in typical inbound peak operation. Both individual insights and 
group opinion results were collected. The results covered both the answers to the 
probing questions and the explanations of how they act and why. 

Probing questions 

Both in the retrospective interviews as well as in the focus group, probing questions 
were used to elicit controller thinking in response to typical disturbing events in 
the inbound peak operation. The probes in this study were designed to investigate 
the underlying principles the controller used when handling a disturbance. Probing 
questions were selected focusing on the thinking processes of the ATC Controller, 
coupled to the cognitive processes involved in ATC distinguished in the ACoPOS 
model. For example, probing questions for situation assessment focused on 
the important information or triggers in the situation, the characteristics in the 
situation, and the kind of possible events or consequences taken into account. 

Data analysis 

Meaningful segments of the 12 analyzed events from the transcript of the interviews, 
the overall focus group opinions, as well as the individual results in the focus 
group were categorized into the ATC Controller cognitive process involved. The 
categorized results were coded by the underlying strategies that may apply, based on 
the list of strategies compiled from literature (see Table 6.1). In the cases where an 
indication for a new strategy was found (coded as “other strategy” within a cognitive 
process category), a characterization of the underlying strategy was provided in 
relation to the specific event and the underlying goal for task performance. 

Results—controller strategies found in handling disturbances in inbound 
peak operation 

Both in the retrospective interviews as well as the focus group, controllers 
indicated they adapt their task performance to be able to serve one or more of the 
following three high-level goals in handling disturbances: (1) to mitigate cognitive 
complexity; (2) to prioritize or balance safety, efficiency, and environment; and 
(3) to cope with high workload. Overall, all strategies from the literature were 
found in the analyzed events in the retrospective interviews with operational 
disturbances (Schuver-van Blanken & van Merrienboer, 2012). In addition, 
Schuver-van Blanken & van Merrienboer (2012) and Schuver-van Blanken & 
Roerdink (2013) found indications for new types of strategies supplementary to 
the strategies available from literature. 

The characterizations of the new types of strategies within each cognitive 
process that were found in addition to those in the literature are displayed in Figure 
6.3. These new types of strategies were present in both the retrospective interviews 
(based on at least two-thirds of segments with the same characterization coded 
as “other type of strategy” within a cognitive process) as well as the focus group 
results (both group opinion and individual expert insights). 
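
The two-thirds criterion mentioned above can be sketched in Python as follows. The reading assumed here is that, within a cognitive process, a new strategy type is retained when at least two-thirds of the segments coded as "other strategy" share the same characterization. The function name, data layout, and example segments are hypothetical and serve only to illustrate the tallying; they are not the study's data or tooling.

```python
from collections import Counter
from fractions import Fraction

# Assumed threshold: at least two-thirds of the "other strategy" segments.
THRESHOLD = Fraction(2, 3)


def dominant_new_strategies(segments, threshold=THRESHOLD):
    """Return {cognitive_process: characterization} for new strategy types.

    `segments` is an iterable of (cognitive_process, code, characterization)
    tuples. Only segments whose code is "other strategy" are tallied.
    """
    tallies = {}
    for process, code, characterization in segments:
        if code == "other strategy":
            tallies.setdefault(process, Counter())[characterization] += 1

    retained = {}
    for process, counts in tallies.items():
        label, count = counts.most_common(1)[0]
        # Exact fraction comparison avoids floating-point error at exactly 2/3.
        if Fraction(count, sum(counts.values())) >= threshold:
            retained[process] = label
    return retained


# Hypothetical coded segments (illustrative only).
example_segments = [
    ("situation assessment", "other strategy", "look around the corner"),
    ("situation assessment", "other strategy", "look around the corner"),
    ("situation assessment", "other strategy", "search for planning information"),
    ("situation assessment", "identifying critical points or hotspots", None),
    ("problem solving", "other strategy", "create/use solution space"),
    ("problem solving", "other strategy", "teaming for problem management"),
]

print(dominant_new_strategies(example_segments))
```

In this made-up example only the situation assessment category meets the threshold, because the two problem-solving segments have different characterizations.
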

Figure 6.3 New types of controller strategies found in the 
exploratory study 

Search for planning information 

Planning information is actively searched for, regarding (updates of) the expected 
approach time, aircraft delay information, the amount of delay in general or the 
expected or planned inbound aircraft (not yet on the radar or still in the previous 
sector), the departing moment of outbound traffic, and the expected amount of 
inbound/outbound traffic. 

Determine the overall operational (OPS) situation 

The overall operational (OPS) situation is determined in a perspective being broader 
than the traffic situation. This includes weather, wind and visibility circumstances 
and expectancies, the runways in use and changes in runway configuration, 
sectorization, expected operating mode, and airspace (un)availability. ATC 
Controllers identify effects of the overall OPS situation, to be able to adapt 
their working method to ensure safety and facilitate optimal efficiency in traffic 
handling in the evolving overall OPS situation. 

Look around the corner 

Controllers indicated the importance of “looking around the corner” to determine 
the traffic situation, traffic density, and traffic handling in the adjacent sector or 
by the next ATC Controller within the sector to be able to pro-actively act on this. 

Metacognitive reflection 

Results indicate that attention and workload are guided by a metacognitive 
reflection strategy on the controller’s own action or expertise. This, for example, 
includes trusting their own experience in judgment in a specific situation, versus 
verification of a potential problem situation using tools; withdrawing or adapting 
a decision in case the solution takes too long to take effect or turns out differently 
than expected; increasing attention at a critical moment (for example, moment of 
turn in), or always keeping spare time, such as by only monitoring one situation at 
a time and not trying to monitor two places at once. 

Teaming for problem management 

Teamwork with the adjacent sector, the controllers within the team, or the planner 
controller is important to enable early or partial problem solving or to prevent 
problems. In addition, controllers also indicated that they consider the (observed) 
taskload of the next ATC Controller in the team, where the ATC Controller might 
decide to increase his/her own workload or redistribute tasks to maintain safety 
and efficiency in the sector and the adjacent sector. 

Create/use solution space 

Controllers create solution space or pro-actively use the available maneuvering 
space to solve problem situations. This is also done to prevent problems, to keep 
other solution possibilities available (for example, to keep a vector possibility) 
or to maintain efficiency (for example, for continuous descent). Creating space 
is needed for conflict solution or maneuvering aircraft, by applying non-standard 
routing or moving/transposing crossing points. Creating space is not only applied 
to prevent potential problem situations or to create separation, but is considered 
crucial to keep efficiency for an aircraft (for example, a continuous climb); to 
create time to solve a situation and to avoid domino-effects for other aircraft 
or to prevent increase of attentional resources (for example, avoid the need for 
monitoring a situation). 

Create a (temporary) pattern 

In traffic handling, controllers create a pattern in their traffic handling, for example, 
by creating a lateral pattern in vectoring or a structural buildup of the vertical 
pattern in holding operation, or to create a traffic flow around weather or hotspots. 
This also helps them to create overview and manage expectancies, and to manage 
workload and maintain efficiency in non-standard routings, as aircraft follow the 
same non-standard route. 

Take control 

Especially in managing disturbances in inbound peak operation, ATC Controllers 
indicated that they immediately act on the disturbed situation: they actively take 
control (in contrast to “wait and see”) to avoid the need for monitoring a 
situation. Immediate action is important as time is critical and to ensure pro-active 
traffic handling, to (partially) solve a problem and to prevent possible problem 
situations and pro-actively manage uncertainties. Taking control also enables 
attention and workload management as the need for monitoring a situation, which 
is attention consuming, is prevented. 


Conclusions and Recommendations 

By using the ACoPOS model as a framework, the factors influencing and causing 
cognitive complexity can be systematically revealed and made visible at a glance. 
The ACoPOS analysis of complexity factors in day-to-day disturbances in 
inbound peak operation comprises not only factors in the tactical situation, but also 
strategic, team, procedural, and system factors. Complexity issues that emerge 
from these factors for situation assessment, attention and workload management, 
and problem solving and decision making have been identified. 

By using ACoPOS in combination with probing questions in retrospective 
interviews and focus groups, insights with respect to controller strategies have 
been revealed, by making controller expertise explicit in handling day-to-day 
disturbances in inbound peak operation. Controllers indicate that they adapt 
their task performance to be able to serve three high-level goals: (1) mitigating 
cognitive complexity; (2) balancing safety, efficiency, and environmental 
impact; and (3) coping with high workload. To do so, ATC Controllers employ 
multiple strategies. In addition to strategies available from literature, the results 
of the exploratory study revealed the presence of new (supplementary) strategies 
that were emphasized by controllers to be critical in order to be able to handle 
disturbances in inbound peak operation. This might indicate that the new strategies 
are crucial for mitigating daily disturbances in ATC peak operation. Analysis of 
additional operational cases, both by means of retrospective interviews as well 
as focus groups, is needed to determine which strategies characterize controller 
expertise in disturbed ATC operation. 

Clarifying what constitutes cognitive complexity in operational disturbances 
and which strategies are underlying ATC Controller expertise in these situations, 
enables us both to reduce cognitive complexity in ATC procedures and systems 
as well as to improve ATC training for acquiring ATC Controller expertise. The 
ATC system can be designed in such a way that cognitive complexity of the ATC 
task is reduced in ATC procedures and systems, whilst optimally supporting the 
ATC Controller strategies that are crucial in response to operational disturbances 
and unpredictable events. Especially in the design of decision support tools with 
increasing automation, it is important to take into account the ATC Controller 
strategies. ATC training can be improved by incorporating strategy learning in 
training as well as ensuring a gradual buildup of cognitive complexity in training 
design. Moreover, it enables us to speak a “common” language explicating 
controller expertise in training, in day-to-day operation, as well as in system 
design processes. 

Chapter 7 

Ecological Flight Deck Design—The World 
Behind the Glass 


Max Mulder 

Delft University of Technology, The Netherlands 


Perspectives on the World 

Being probably the first plenary speaker at the ISAP symposium from The 
Netherlands, I would like to share with you some of my earliest and truthfully 
“Dutch” perspectives on the world. When I was a young boy I was repeatedly told 
that, although The Netherlands is only a small country, our impact on the world 
and how it was explored and developed in the past hundreds of years, has been 
significant. In the sixteenth and especially the seventeenth century, also known as 
the “Dutch Golden Age,” famous Dutch explorers like Abel Tasman and Willem 
Barentsz, respectively, discovered Tasmania and explored the Arctic. The now 
great city of New York was ours! These journeys still form a major part of our 
Dutch identity, our culture and literature, and in my childhood I devoured the 
incredibly exciting books about the journeys of our great-great-... ancestors. 

One of the journeys that I enjoyed reading about most was the attempt of 
Abel Tasman in 1642-1643 to find the, what was correctly assumed to be, huge 
southern landmass called “Zuid-Land,” the Dutch name for “Terra Australis,” now 
known as Australia. What was so remarkable about this particular journey was 
that Abel Tasman sailed all around the Australian landmass without finding it, 
undershooting it at the Southern latitudes, discovering Tasmania (named after him) 
and Nieuw Zeeland (New Zealand), returning to the North via our former colonies 
Nieuw Guinea (New Guinea) and Indonesië (Indonesia), and then overshooting it 
at the Northern latitudes on his way back to the Netherlands. What a remarkable 
feat to circumnavigate this huge continent, and not find it! This tragic but also 
rather funny fact made an everlasting impression on me as a child. 

Reading all these stories about the Dutch (and not to forget the Portuguese 
explorers who were often ahead of us in their discoveries) I often asked myself 
why it was that we were so successful in discovering all this land? Was it a 
typical Dutch quality to go out and explore, as our country is so small, was it 
the everlasting Dutch struggle with water, as most of our country lies below sea 
level, causing us to have many ships and very capable sailors, or was it something 
else? I still remember sitting in class at school and staring again for a long time 
at the map of the world that we had hanging there, before the insight struck me: 
the Netherlands was almost at the center of the map, very conveniently located 
indeed for explorers to go out and discover new shores! I had a simple and elegant 
explanation for our success as world explorers: we were simply living close to the 
center of the world! It all came together and I could not wait to share my incredible 
insight with friends, who were at that time impressed by my all-explaining theory 
and sharp mind. 

I kept this “Dutch” perspective on the world for a long time, until at a later 
age I started travelling myself and on one of my travels I found out that others 
use different maps, like in countries of North and South America where these 
continents are put at the center of their world map. The first time I saw a map 
like that I had to really think about what I was looking at; I did not recognize the 
world as I knew it! Later I learned that Chinese maps show China at the center, 
and for a long time they referred to their country as the “middle kingdom.” So my 
perspective of being born in the center of the world was not as unique as I thought. 
It is an elegant and simple perspective, but also wrong: the center of the world lies 
6,378 kilometers below the Earth’s surface. 

The reason I start with this anecdote is to show the power of representations of 
the world. A good representation helps us to structure our thoughts, to cope with 
uncertainties, and to answer the questions that we may have about why the world 
works like it does. But even good and clever representations, that may function 
for a long time without any problem, may turn out to be incomplete, or wrong. 
It took a genius like Albert Einstein to come up with a totally new relativistic 
perspective on physics that encompassed the highly successful mechanical 
perspective of that other genius, Isaac Newton. Furthermore, Einstein’s theory was 
able to explain some of the phenomena that could not be explained in a satisfactory 
manner with Newton’s theory, and it has since then been our dominating view on 
the world of physics. 

In this chapter I would like to first touch upon some of the past, current, 
and possible future perspectives on designing human-machine interfaces and 
automation systems on the flight deck. I will then use an example to show how a 
new “ecological” representation of the world may provide a much better support 
for pilots than a conventional engineering representation. The purpose of this 
chapter is to invite and encourage readers to think about our representations of 
the world, whether these representations are suitable to support humans in their 
situation awareness and decision making, or that perhaps better representations 
exist that better suit this purpose. 


Ecological Approach to Flight Deck Design 

In the past decades the flight deck has evolved from a collection of electro-mechanical 
instruments dedicated to specific aircraft states to the modern glass 
cockpit that allows integration of the aircraft states into configural graphic images. 
Additionally, enormous cost reductions have been achieved through introducing 
clever automation, reducing the flight crew to only two persons, and changing the 
role of the crew from manual controllers to supervisors of a highly-automated, 
complex system (Wiener & Curry, 1980; Billings, 1997). 

In the 1960s, the cockpit of commercial aircraft presented all information to the 
pilots, navigators and flightdeck engineer on a large array of electro-mechanical 
instruments. Generally speaking, everything that could be measured was 
presented, in an attempt to provide the humans on-board with as much information 
as possible. The crew then had to integrate all this information, forming a mental 
picture of the current state of the aircraft, predict that state, and act on it in a way 
that satisfied the mission goals best. Most of the cognition was to be done by the 
humans on-board and because of the plethora of information and the dial-and- 
gauge interface design, workload was high and performance was relatively low. 
This led cockpit design engineers to conclude that, apparently, flying the aircraft 
in an accurate and efficient fashion is too difficult for humans and could be done 
better by a computer. 

In the modern cockpit (which often has 120+ computers working in parallel) 
most of the basic flying and optimization tasks have been automated, and most of 
the work to be done and the corresponding cognition needed to perform the job has 
been put into the computer. The modern flightdeck is one that has low workload 
in most phases of flight, to increase only in situations that are unanticipated by 
automation design and that require human ingenuity and adaptive skills. And here 
it is where the other side of the automation coin appears: driven away from the 
basic control loops, the modern flight crew often has low situation awareness, 
potentially leading to human error. The persistence of human error is evidence 
for the fact that also modern flight decks fail to adequately support the unique 
capabilities of humans, the superior creative problem solvers. 

Our approach to the design of human-machine systems, defined here as 
interfaces and automation, is one where the cognitive work is shared by humans 
and automation, aiming at a work environment where the crew is involved in all 
operations, at a reasonable workload level and leading to high situation awareness 
at all times. We agree that most of the work can indeed be performed better (faster, 
more accurate, involving more dimensions for optimization) by machines and 
automation and do not argue against automation per se. The essence lies in how 
we can design automation and interfaces in such a way that they are based on a 
representation of the world that can be shared between the automated and human 
agents, in a joint-cognitive system (Amelink, 2010; Woods & Hollnagel, 2006). 

In the evolution of the cockpit from what it was 50-60 years ago to the 
modern glass cockpit, aviation psychologists and engineers developed several 
very useful and important interface design principles, starting on the basic level 
of illumination, readability, use of colors and symbols and further evolving to 
the laws of integrated, configural, or object displays, emergent features, and the 
principle of the moving part (Johnson & Roscoe, 1972; Roscoe, Corl & Jensen, 
1981). These principles continue to be valid but provide little help in determining 
the “right” representation of the world to facilitate collaboration between humans 
and automation. These principles improve access to data, but they do not suggest 
ways to create human-machine systems that support the cognitive work and 
creative abilities of the human pilot. 

One of the main starting points of our work is the classification of a flight deck 
as an open system (Vicente, 1999). It has many interfaces with its environment, 
namely weather, other traffic, terrain, ATC, and so on. It has an extensive and 
complex interaction with the environment, which makes its operation open and 
unpredictable, such that we cannot think of all possible events in advance. We 
really need the adaptability of the humans to deal with the unanticipated variability. 
So what could be an approach to automation and interface design that helps human 
pilots with their cognitive work? 

In a sense, the really old cockpits are classic examples of a design philosophy 
called a “single sensor, single indicator” (SSSI), where you basically present all 
information that you have in a readable format, communicating with the humans 
on the level of signals (Vicente & Rasmussen, 1990). Naturally it would be very 
difficult for pilots to integrate all this information and that’s how the automation 
came into play, putting a lot of the cognitive work to be done into programmable 
computers which then, based on some algorithm, would tell pilots what to 
do, communicating with the humans on the level of signs, intended to elicit 
predetermined solutions to situations anticipated in the design of the automation. 
We believe the current cockpit fails to properly support humans in dealing with 
emergent situations that were not anticipated in the design of automation. To deal 
with the inevitable unanticipated variability in this complex domain, it becomes 
necessary to also support productive thinking that enables the pilot to discover or 
creatively invent solutions to these emergent problems. This requires that the pilot 
has representations of the deep structure of the work domain. 

Graphical representations that provide patterns that are linked naturally to 
functionally relevant relations among the state variables are of course preferred, as 
humans can learn to recognize distinctive patterns and thus they can be “aware” of 
the situation with minimal cognitive effort. In our work we aim to develop human- 
machine systems that literally “show” and “work” on the problem space in such a 
way that the pilot can use the display representation as a template. This “problem 
space,” however, is not normally visible to the human eye, as in our daily-life 
activities such as walking around, drinking coffee, throwing a ball, or riding a 
bicycle. In his “ecological” approach to visual perception, Gibson emphasizes 
the “direct perception” capabilities of humans, and the direct couplings that exist 
between human perception and action (Gibson, 1966, 1979). He introduced the 
concept of affordance, possibilities and constraints for actions and achieving 
goals, specified by the natural environment. Take for example a pile of large rocks 
in the wilderness. Depending on the situation at hand, a tired stroller may see the 
pile of rocks as a means to rest; when lost the rambler may see the pile of rocks 
as an opportunity to climb and move to a higher viewpoint; when it starts raining 
the stroller may see an opportunity for shelter. This is just a subset of all possible 
meanings that the pile of rocks may have for an actor in the environment, all 
specified by the natural display that can be perceived directly. 

Vicente and Rasmussen took this stance when proposing their “ecological 
approach” to interface design for complex systems, to use technology in creating 
interfaces that provide meaningful information and that allow the human operators 
to directly act on the information to achieve their goals (Vicente & Rasmussen, 
1990, 1992). Their basic idea was to “make visible the invisible,” that is, to transfer 
the cognitive process of understanding and interacting with complex systems to a 
perceptual process, where operators interact with representations of that complex 
process on (usually graphical) interfaces. Other than in interacting with the natural 
world, complex systems often do not allow humans to just “step-in and explore” 
these systems, rather, the interface is the medium for interaction. In essence then, 
an Ecological Interface Design (EID) shows the deep structure of complex work 
domains in a way that is compatible with human perception. 

In his book Cognitive Work Analysis, Vicente (1999) proposed six steps in the 
development of an ecological display: work domain analysis, control task analysis, 
strategies analysis, an analysis of social organization and cooperation, worker 
competencies analysis, and finally the interface design. From these steps the Work 
Domain Analysis (WDA) is the most important one, as here the human-machine 
system designer must analyze the basic functioning of the work domain for which 
the system has to fulfil its purpose. Rather than trying to understand the cognitive 
processes that may guide the operator in doing the work, the WDA focuses on 
the environment and the ways in which the world constraints and physical laws 
afford actions. Developing an appropriate representation of this “action space,” 
independent of the human or automated agent, in fact a representation that is 
“true” for both, stands at the center of the ecological approach. 

In the past decade we designed and evaluated a number of ecological interfaces 
for the flight deck. Examples are a Total Energy management display for basic 
aircraft symmetrical flight control, that enables pilots to understand and act on 
exchanging their aircraft potential and kinetic energy (Amelink, Mulder, Van 
Paassen & Flach, 2005); a Separation Assistance display that allows pilots to 
better understand and act on other traffic (Van Dam, Mulder & Van Paassen, 
2008); and an ecological Synthetic Vision display (Borst, Suijkerbuijk, Mulder & Van 
Paassen, 2006; Borst, Sjer, Mulder, Van Paassen & Mulder, 2008; Borst, Mulder, 
Van Paassen & Mulder, 2010). We also explored various EID designs for ATC 
in current and future air traffic management environments (Van der Eijk, Borst, 
In ’t Veld, Van Paassen & Mulder, 2012; De Leege, Van Paassen & Mulder, 2013; 
Klomp, Van Paassen, Mulder & Roerdink, 2011; Klomp et al., 2012; Tielrooij, 
In ’t Veld, Van Paassen & Mulder, 2010; Van Paassen et al., 2013). 

In the keynote I showed an example of an ecological display that we developed 
for arrival management in a future 4D TBO terminal area (the reader is referred to 
Klomp et al., 2011, and also Van der Eijk et al., 2012; De Leege et al., 2013). It is 
a rather complicated design, and explaining how it works and what affordances of 
the complex world of terminal operations it shows would take me several hours. 
The reason I showed it is to make very clear from the beginning that ecological 
interfaces are not by definition simple, easy-to-use interfaces that very quickly 
turn novices into experts. That is a common misconception on EID. Rather, 
ecological interfaces are designed for complex work and the complexity of the 
work domain is reflected by the complexity in the visual interface (Flach, 2012). 
Ecological interfaces are made by experts for experts, as it really requires the 
analyst to understand the problem space of the work domain extremely well. In 
creating ecological interfaces the members of my team have become true experts 
in their application. 


Example: An Ecological Separation Assistance Display 

Setting the stage 

I would like to illustrate our approach through discussing an example. The scope 
of the example is small, for the sake of space and also for the sake of having an 
example that can be understood and appreciated by most of you. The subject will 
be the development of an interface that supports pilots in the task of maintaining 
a safe separation with other traffic in the vicinity of their own aircraft. Currently 
this is a task done by ATC but in the future, parts of the airspace may become 
“unmanaged,” and here the pilots and their automation systems will take care of 
the separation task (SESAR, 2007). For the sake of simplicity the example will 
be limited to the two-dimensional case; describing our three-dimensional or even 
four-dimensional interfaces would require too much space and would hamper an 
understanding of the basics of the ecological approach, which is the purpose of 
this chapter. 

An airborne separation assistance system, ASAS for short, involves “the 
equipment, protocols, airborne surveillance and ... which enable the pilot to 
exercise responsibility, ... for separation of his aircraft from one or more 
aircraft” (ICAO SICASP/6-WP/44). ASAS functionalities, that is, the work to 
be conducted by the automation and/or pilot, include: (1) maintaining an overview 
of the surrounding traffic; (2) detecting potential loss of separation conflicts; (3) 
resolving conflicts; and (4) preventing aircraft from running into new conflicts. 
Note that a conflict is defined as a potential loss of separation, in the future. The 
development of ASAS systems has received a lot of attention in the past decades 
and various prototypes have been built and tested (for an overview, see Hoekstra, 
2001). In the following, some of these designs and the approaches that led to them 
will be briefly discussed. 

Common to many ASAS designs is that they are based on accurate trajectory 
prediction algorithms which compute the “closest point of approach” (CPA) 
and then have another computer algorithm “reason” about the best way to deal 
with situations where the CPA is predicted to become too small. Typically these 
algorithms are put into a computer, and then the interface designer is brought into 
play to create the interface. In the light of the discussion above: cognition is being 
put into the computer, hidden from the pilot, and communication is done at the 
level of signals (where is the other aircraft?) and signs (are we moving too close? 
warn the pilot!). 
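
To make the CPA-based reasoning concrete, the following is a minimal sketch of such a calculation for the two-dimensional case. It assumes both aircraft fly straight at constant velocity; the function names, the 5 NM separation minimum, and the five-minute look-ahead are illustrative assumptions and are not taken from any of the ASAS prototypes cited here.

```python
import math

def closest_point_of_approach(own_pos, own_vel, intr_pos, intr_vel):
    """Return (t_cpa, d_cpa) for two aircraft on straight, constant-velocity tracks.

    Positions are (x, y) in NM, velocities are (vx, vy) in knots,
    so t_cpa is in hours and d_cpa in NM.
    """
    # Position and velocity of the intruder relative to the own aircraft.
    rx, ry = intr_pos[0] - own_pos[0], intr_pos[1] - own_pos[1]
    vx, vy = intr_vel[0] - own_vel[0], intr_vel[1] - own_vel[1]
    v2 = vx * vx + vy * vy
    if v2 == 0.0:
        t_cpa = 0.0  # no relative motion: the distance never changes
    else:
        # Time at which the relative distance is smallest, never in the past.
        t_cpa = max(0.0, -(rx * vx + ry * vy) / v2)
    d_cpa = math.hypot(rx + vx * t_cpa, ry + vy * t_cpa)
    return t_cpa, d_cpa

def is_conflict(t_cpa, d_cpa, min_sep_nm=5.0, look_ahead_h=5.0 / 60.0):
    """Predicted loss of separation within the look-ahead time (assumed values)."""
    return d_cpa < min_sep_nm and t_cpa <= look_ahead_h

# Own aircraft flying north at 480 kt, intruder 30 NM east and 30 NM north, flying west.
t, d = closest_point_of_approach((0.0, 0.0), (0.0, 480.0), (30.0, 30.0), (-480.0, 0.0))
print(f"CPA in {t * 60:.2f} min at {d:.1f} NM, conflict: {is_conflict(t, d)}")
```

Note that the pilot only receives the outcome of such a computation (a symbol, a warning); the relative-motion geometry that produced it stays inside the algorithm.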

Not surprisingly, in many of the ASAS evaluations with pilots the typical 
pitfalls and ironies of automation (Bainbridge, 1983; Parasuraman and Riley, 
1997) have come to the surface: hidden rationale, confusion of the automation 
intent, disagreement, lack of trust or overreliance, and low situation awareness. 
Why does the automation propose this solution? What will happen when I follow 
the automation’s advice? What if I don’t? Apart from these issues, it is a fact that 
there will always be cases that the automation designers and engineers did not 
think of, because of the open and complex nature of interaction of the aircraft in its 
environment. And many examples exist of cockpit automation that is only aware 
of a part of the context (for example, terrain) and ignorant of other constraints 
to flight (for example, traffic). The automation often will not support pilots in 
these complex, multi-constraint situations while the latter remain ultimately 
responsible for the operation (see definition above). 

As a final note before we start with the analysis, keep in mind that self-separation 
problems typically evolve very slowly. They are not to be confused 
with TCAS systems that operate on aircraft getting too close to each other within 
a horizon of 40 seconds. ASAS systems work with time horizons of three to five 
minutes. Thus, the aircraft will be quite a distance apart, requiring pilots to zoom 
out their navigation display to see them, which in turn will cause these aircraft to 
move on the display very slowly. Hence, although the approach and movement 
itself is “visible,” it will be difficult for pilots to explore possible conflict situations 
and the way to resolve them from the basic navigation display, as the resolution 
and scale of changes-in-time will make potential conflicts difficult to see. Clearly 
there is a need here to make the separation problem more “compatible” to human 
perception. 
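
A rough calculation illustrates the difference in scale. The sketch below uses the time horizons mentioned above (40 seconds versus five minutes); the closing speed of 900 kt, the 80 NM display range, and the 100 mm display radius are purely illustrative assumptions.

```python
def distance_at_horizon_nm(closing_speed_kt, horizon_s):
    """Distance (NM) that two aircraft close during the look-ahead time."""
    return closing_speed_kt * horizon_s / 3600.0

def symbol_speed_mm_per_s(closing_speed_kt, display_range_nm, display_radius_mm):
    """Apparent closing speed of a traffic symbol on a map display."""
    mm_per_nm = display_radius_mm / display_range_nm
    return closing_speed_kt / 3600.0 * mm_per_nm

# Assumed head-on encounter: two aircraft at 450 kt each, closing at 900 kt.
closing_kt = 900.0
print(distance_at_horizon_nm(closing_kt, 40))    # TCAS-like horizon
print(distance_at_horizon_nm(closing_kt, 300))   # ASAS horizon of five minutes
print(symbol_speed_mm_per_s(closing_kt, 80.0, 100.0))
```

Under these assumptions the aircraft are 10 NM apart at the TCAS horizon but 75 NM apart at the ASAS horizon, and on a display zoomed out far enough to show the intruder its symbol closes at roughly a third of a millimeter per second.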

Work domain analysis 

In our lab we worked on this ASAS problem for a number of years, asking 
ourselves whether there would be a different way to create an ASAS system; 
stated otherwise, would there be a different representation of the traffic separation 
problem (other than CPA-based calculations) that would be better suited to present 
to the human operator, communicate with him or her at the “symbol” level, so that 
he or she understands the situation at a glance and can act on it in a proper way, 
with or without the help of automation? Would this representation allow for the 
design of the interface and the automation to create a joint cognitive system? 

We started with numerous computer simulations, trying to figure out what 
are the physical laws and abstract functions that govern the dynamics of the 
separation control problem. We used Rasmussen’s Abstraction Hierarchy (AH) 
as a reference, considering the separation assistance problem and the system that 
deals with it at five levels of abstraction: Functional purpose, Abstract function, 
Generalized function, Physical function, and Physical form (Rasmussen, Pejtersen 
& Goodstein, 1994). 

These simple computer simulations of aircraft flying in a two-dimensional 
airspace led us to investigate and define travel functions that define the problem 
and solution space of self-separation (Van Dam, Mulder & Van Paassen, 2008). 
We realized that these travel functions form the core of the separation problem 
and act at the Abstract function level of the AH: absolute and relative motion, 
and separation. Manipulating the relative motion of aircraft requires aircraft 
to maneuver and to coordinate these maneuvers such that the separation is 
maintained at all times; these are the Generalized functions. 

At the highest level, Functional purpose, the goal of having an ASAS system 
could be defined, which is to ensure safety at all times, which was obvious from the 
start, but the simulations led us to add two more: be productive and efficient. These 
were added since for particular geometries we found out that some maneuvers 
were indeed safe, but would lead to situations where aircraft needed to make a 
more than 90 degree turn (or even fly back) or that it would take very long for the 
separation problem to be resolved. The resulting AH is shown in Figure 7.1. At the 
Physical function level we see the actual traffic that flies within the vicinity of the 
own aircraft, and the control units that pilots have to manipulate the Generalized 
functions: their cockpit interfaces to autopilot, throttle and flight management 
systems. At the Physical form level we see the state of the own aircraft and the 
locations and states of the other aircraft involved. 



Figure 7.1 Abstraction hierarchy (simplified) for the separation 
assistance system 
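
For readers who prefer a compact textual form, the five levels as described in the text can be written down as a simple data structure. This is a sketch only: the entries paraphrase the description above and do not reproduce the figure, and the helper `means_ends` is a hypothetical name for the usual why/how reading of an abstraction hierarchy (the level above gives the reason, the level below the means).

```python
# Levels ordered from most abstract (top) to most concrete (bottom).
# Entries paraphrase the chapter text; they are not copied from Figure 7.1.
ABSTRACTION_HIERARCHY = {
    "Functional purpose": ["safety", "productivity", "efficiency"],
    "Abstract function": ["absolute motion", "relative motion", "separation"],
    "Generalized function": ["maneuvering", "coordination of maneuvers"],
    "Physical function": [
        "surrounding traffic",
        "cockpit interfaces to autopilot, throttle and flight management systems",
    ],
    "Physical form": [
        "state of the own aircraft",
        "locations and states of other aircraft",
    ],
}

LEVELS = list(ABSTRACTION_HIERARCHY)  # dicts keep insertion order

def means_ends(level):
    """Return (why, how): the level above and the level below the given one."""
    i = LEVELS.index(level)
    why = LEVELS[i - 1] if i > 0 else None
    how = LEVELS[i + 1] if i < len(LEVELS) - 1 else None
    return why, how

for name in LEVELS:
    print(name, "->", ", ".join(ABSTRACTION_HIERARCHY[name]))
print(means_ends("Abstract function"))
```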

Note that this AH has been subject to numerous iterations, as can be seen in 
our publications on this subject over the years (Van Dam, Mulder & Van Paassen, 
2008). Further note that we have been struggling with the AH for this problem for 
quite some time as, other than in the process control domain where the Abstract 
and Generalized functions can be quickly connected to the physics of the plant 
being controlled, in this aircraft separation problem the “physics” were not clear 
from the beginning. Of course, the physics of aircraft flight dynamics were there 
for us but we found out that these were not very meaningful in this particular 
problem as they well describe the motions of a single aircraft but not the physics of 
separating two or more aircraft. We developed our own “meaningful physics” for 
this problem through the computer simulations stated above, yielding the “travel 
functions.” 

Traditional and ecological approach 

Now, when considering the “typical engineering approach” in the context of 
the AH that comes from the work domain analysis, we see that the computer 
algorithms are programmed to “understand” and “work on” especially the 
Abstract function and Generalized function levels. Through the cockpit interfaces, 
the pilots are shown the elements of the physical environment (other aircraft), the 
Physical form level, they have their control buttons and dials to provide new set- 
points to their automated agents, the Physical function level, and they are trained 
to understand the signals and signs that the ASAS system provides them at the 
Functional Purpose level. In this design, pilots will understand why the system is 
there (functional purpose), they are trained how to work with the system (Physical 
function, physical form) yet they have little insight into how the system actually 
works and deals with the environmental constraints (Abstract and Generalized 
function levels). 

The rationale behind the signals and signs, however, is hidden in the 
automation, and the pilot has little insight into understanding how the computer 
has interpreted and dealt with the traffic situation at the Abstract and Generalized 
function levels. And indeed this is “typical” for many of the human-machine 
systems and automated tools that have been developed for the modern flight deck, 
hiding the rationale from the pilots, putting the real cognition and processing of 
data and situations into actions and advice in pieces of automation that are 
non-transparent, leading to low situation awareness, workload peaks, and all the typical 
ironies of automation. 

Clearly then, in an EID approach, the rationale of the automated algorithms 
and the invisible but crucial elements of the world domain should be visualized. 
In our designs we therefore aim at making visible the invisible, showing pilots 
the “world behind the glass” at the Abstract and Generalized function levels, such 
that with or without automated help they can reason about the traffic situation 
themselves. Without automation they should be able to detect and resolve conflicts 
themselves and also to do it in a way that is safe, efficient, and productive. With 
automation in place pilots should be able to (much) better understand the signals 
and signs (warnings and resolution advisories) that the automation provides, as 
the communication will also show the deep structure that provides a context for 
interpreting the meaning of these signals and signs. Again note that ecological 
interfaces are not intended to put the automation aside, on the contrary, they 
facilitate coordination by creating the transparency that is needed for pilots to 
understand situations and judge the logic underlying the automation’s actions 
(Amelink, 2010). 

Traditional display design 

In the past decades much research has been conducted on the self-separation 
problem (for example, in the “Free Flight” programs that ran in the late 1990s). 
Numerous attempts were made to better support the pilots in understanding 
the essence of traffic conflicts and how the automation deals with them. Early 
visualizations showed the point of closest approach (CPA) on the navigation 
display, often graphically put onto the display as ellipsoidal “no-go” zones. The 
affordance of the “hit” is then clear for pilots, that is, when their planned trajectory 
crosses any of these no-go zones, they will see and rapidly understand that the 
current path will lead them into a conflict with another aircraft. The affordance 
of the “avoidance” is only partially clear, as only the conflict resolution through 
heading changes can be perceived whereas it could well be possible that also a 
(slight) change in the aircraft velocity would be sufficient to get out of trouble. 
The effects of flying a (little) slower or faster on the conflict are not well specified. 
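
As a minimal sketch of the quantity behind these no-go zones, the following computes the time and distance of closest approach for two aircraft. It assumes straight-line, constant-velocity flight in a flat two-dimensional plane; the function name and argument layout are illustrative assumptions, not taken from the systems described above.

```python
import math


def closest_point_of_approach(own_pos, own_vel, intr_pos, intr_vel):
    """Return (t_cpa, d_cpa) for two aircraft on straight, constant-velocity paths.

    Positions and velocities are (x, y) tuples in consistent units
    (for example nautical miles and nautical miles per minute).
    """
    # Position and velocity of the intruder relative to the own aircraft.
    rx, ry = intr_pos[0] - own_pos[0], intr_pos[1] - own_pos[1]
    vx, vy = intr_vel[0] - own_vel[0], intr_vel[1] - own_vel[1]
    v2 = vx * vx + vy * vy
    if v2 == 0.0:
        # Identical velocities: the distance never changes.
        t_cpa = 0.0
    else:
        # Time at which the relative distance is minimal, never in the past.
        t_cpa = max(0.0, -(rx * vx + ry * vy) / v2)
    d_cpa = math.hypot(rx + vx * t_cpa, ry + vy * t_cpa)
    return t_cpa, d_cpa
```

A planned trajectory crosses a no-go zone when `d_cpa` is smaller than the required separation; note that this single number does not say whether a speed change alone would resolve the conflict, which is the limitation discussed above.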

Evaluations with these no-go zones showed that new conflicts were often 
triggered by maneuvers initiated to resolve other conflicts. The same holds 
for current ATC operations, where an analysis showed that close to 50 
percent of all short-term conflict alerts were caused by operators’ responses to 
previous alerts (Lillo et al., 2009). This led engineers and display designers to 
come up with the concept of heading and speed bands, computed by automation, 
that show all possible headings of the own aircraft that would result in a conflict 
(assuming constant current speed) and all possible speeds that would result in 
a conflict (assuming constant current heading), respectively. Later a 
computer-aided “optimal” solution was also shown, usually a combination of a small speed 
and heading change, that was the best and most efficient way out of the conflict 
(Hoekstra, 2001). 
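
The band concept can be sketched as a simple sweep: hold the speed and test every heading, then hold the heading and test every speed. The code below is an illustrative assumption of how such bands might be computed, not the algorithm of any cited system. It assumes a flat two-dimensional plane, a circular protected zone of radius `sep`, a finite look-ahead time, and headings measured clockwise from north (the +y axis).

```python
import math


def in_conflict(own_vel, rel_pos, intr_vel, sep, lookahead):
    """True if separation drops below `sep` within `lookahead` time units."""
    rx, ry = rel_pos  # intruder position minus own position
    vx, vy = intr_vel[0] - own_vel[0], intr_vel[1] - own_vel[1]
    v2 = vx * vx + vy * vy
    t = 0.0 if v2 == 0.0 else -(rx * vx + ry * vy) / v2
    t = min(max(t, 0.0), lookahead)  # clamp to the look-ahead window
    return math.hypot(rx + vx * t, ry + vy * t) < sep


def velocity(heading_deg, speed):
    """Velocity vector for a heading in degrees (0 = north, clockwise)."""
    h = math.radians(heading_deg)
    return (speed * math.sin(h), speed * math.cos(h))


def heading_band(speed, rel_pos, intr_vel, sep, lookahead, step_deg=1):
    """Headings that lead to a conflict, assuming constant current speed."""
    return [h for h in range(0, 360, step_deg)
            if in_conflict(velocity(h, speed), rel_pos, intr_vel, sep, lookahead)]


def speed_band(heading_deg, speeds, rel_pos, intr_vel, sep, lookahead):
    """Speeds that lead to a conflict, assuming constant current heading."""
    return [s for s in speeds
            if in_conflict(velocity(heading_deg, s), rel_pos, intr_vel, sep, lookahead)]
```

Each band is a one-dimensional slice through the heading-speed plane, which is why combined speed and heading solutions cannot be read from the bands alone.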

With these speed and heading bands, and the optimal solution presented, pilots 
indeed can see how to avoid other aircraft. They have a hard time, however, finding 
out themselves what would be the most efficient way to resolve the conflict and 
especially to see and check whether the computer-aided solution and heading and 
speed bands are in fact correct. And the optimal solution very often appears right 
into the heading and speed bands that act as “no-go” states, causing confusion and 
a lack of confidence (automation irony at work). When the own aircraft is involved 
in a multi-aircraft conflict, more and more no-go bands would appear which would 
be very difficult for pilots to relate to the particular aircraft involved. This iteration 
of typical engineering and interface design did not end up with an easy-to-use 
interface and the resulting representation of the problem (CPA, heading and speed 
bands) has in fact obscured the way the world works. This representation of the 
dynamics of separation assistance is a dead end. 

Ecological display design 

We took a different approach to the problem, based on visualizing the full 
affordances of relative and absolute motion. For a comprehensive description of 
the design and the process we have gone through, the reader is referred to Van 
Dam, Mulder & Van Paassen, 2008. 

When it is assumed that we know the locations and velocities of all aircraft 
within the vicinity of our own aircraft, then we can easily compute the set of 
relative velocity vectors that will bring us into a conflict situation with that other 
aircraft. The task of the pilot is then to manipulate the velocity vector of the own 
aircraft, its direction (=heading of the own aircraft) and magnitude (=speed of the 
own aircraft), in such a way that it does not belong to this set. Stijn van Dam was 
the first to develop an own aircraft-centered presentation of this relative motion, 
a presentation that in a symbolic way shows the affordances of “hit” and “avoid” 
that can be directly perceived and acted upon by the pilot (or automation). We later 
found out that in robotics theory, the collision cone (Chakravarthy & Ghose, 1998), 
velocity obstacle (Fiorini & Shiller, 1998), and maneuvering board techniques 
(Tychonievich et al., 1989) were developed for a very similar problem, and again 
later found out that the so-called Battenberg course indicator (dating back to 
1892!) aims to visualize similar maneuvering constraints for ships. 
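
The set of conflicting velocity vectors described above can be sketched geometrically. Under the assumptions of constant intruder velocity, a circular protected zone of radius `sep`, and an unlimited look-ahead time, the forbidden zone is a cone whose apex sits at the intruder's velocity vector and whose axis points from the own aircraft toward the intruder. The function name and signature below are hypothetical, chosen only for illustration.

```python
import math


def in_forbidden_zone(own_vel, intr_vel, rel_pos, sep):
    """True if `own_vel` lies in the set of velocities that lead to a conflict.

    rel_pos is the intruder position minus the own position; all vectors are
    (x, y) tuples in consistent units.
    """
    dist = math.hypot(rel_pos[0], rel_pos[1])
    if dist <= sep:
        return True  # separation is already lost
    # Own velocity as seen from the intruder (shifts the cone apex to intr_vel).
    wx, wy = own_vel[0] - intr_vel[0], own_vel[1] - intr_vel[1]
    w = math.hypot(wx, wy)
    if w == 0.0:
        return False  # no relative motion, so the distance stays constant
    # Half-angle of the cone that just touches the protected zone.
    half_angle = math.asin(sep / dist)
    cos_angle = (wx * rel_pos[0] + wy * rel_pos[1]) / (w * dist)
    angle = math.acos(max(-1.0, min(1.0, cos_angle)))
    return angle < half_angle
```

The pilot's task, in these terms, is to choose a heading and speed for which this test returns `False` for every nearby aircraft.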

The ecological ASAS display is illustrated in Figure 7.2 in its most elementary 
form. It is a two-dimensional semicircular presentation that can be used as an 
overlay in the traditional navigation display of modern cockpits. We developed 
a vertical presentation (Heylen, Van Dam, Mulder & Van Paassen, 2008), a 
combined co-planar horizontal/vertical presentation (Ellerbroek, Brantegem, 
Van Paassen & Mulder, 2013) and a three-dimensional orthogonal presentation 
(Ellerbroek, Visser, Van Dam, Mulder & Van Paassen, 2011) as well. Central in the 
presentation is the velocity vector of the own aircraft. The small and large 
semicircles indicate the minimum and maximum velocities of the aircraft, respectively, 
meaning that the size of the velocity vector cannot exceed these limits, constraints 
“internal” to the aircraft and caused by performance limits (Physical function). 
The size of the vector can be changed, indicating speed changes: it can be made 
larger (fly faster) or smaller (fly slower), but the length cannot exceed the velocity 
limits. The vector can also be rotated to the left and right, indicating heading 
changes. Heading changes larger than 90 degrees left or right are possible but are 
considered to be not very productive (functional purpose). 

Figure 7.2 A (highly) simplified version of the Ecological separation 
assistance display: the “state-vector” envelope for 
2-dimensional motion 

The crucial element of the display is the triangular-shaped zone that visualizes 
the set of own aircraft velocity vectors that will result in a conflict with another 
aircraft present in the vicinity. All heading and speed settings of the own aircraft 
that result in a velocity vector pointing in this “forbidden beam zone” will be 
unsafe (functional purpose). Vice versa, all heading and speed settings of the 
own aircraft that result in a velocity vector pointing outside this zone are safe. 
These are the constraints to our own aircraft motion that are caused by the other 
aircraft flying in our vicinity, the “external” constraints to our flight (Abstract 
function). With this display, pilots can directly perceive that they are in trouble 
and that many options exist to get out of trouble by changing either speed, or 
heading, or both (Generalized function). Pointing the own aircraft velocity vector 
below the zone will mean that the other aircraft will pass us in front, pointing 
the vector above the zone will mean that we will pass the other aircraft in front. 
It shows the consequences of our possible actions in a directly perceivable way. 
It directly visualizes the dynamics of relative motion (Abstract function) and the 
ways to fulfill our functional purposes through manipulating this relative motion 
(Generalized function). We have indeed “made visible the invisible,” connecting 
the means of flying with the ends of flight, making this a truly ecological interface 
(Van Dam, Mulder & Van Paassen, 2008). 

When working with this representation we obtained some important insights. 
First of all, the state-vector envelope presentation shows the complete “solution 
space” to pilots, and includes all possible heading bands and speed bands (see 
Figure 7.3a) of the original design. With this visual, symbolic presentation of the 
problem the pilot can easily see the optimal solution, as it is the smallest state 
change of the own velocity vector that will be pointing outside the zone (see 
Figure 7.3b). And the most interesting property of all is that when more aircraft are 
flying close, all these aircraft may cause external constraints that further limit the 
own aircraft motion possibilities, “eating up parts of our solution space” as we call 
it, see Figure 7.3c. This makes the display also suitable for resolving multi-aircraft 
separation assistance problems, although determining the “optimal maneuver” 

Figure 7.3 Example of how the ecological separation display specifies 
all the constraints: A) the display contains all “heading 
band” constraints; B) the display contains all “speed band” 
constraints; C) the display specifies the “optimal solution”, the 
smallest state change; D) the display can easily and intuitively 
show the constraints of multiple aircraft conflicts 

may be difficult and could be done best by an automated agent. Again, the EID 
does not prohibit the use of automated help; on the contrary, it could well be the 
transparent window to the automation that is so much needed when pilots are there 
to verify the automated agents’ advice, a task for which they will ultimately be 
responsible. And note that the internal and external constraints as visualized on the 
ecological interface are a property of the “world behind the glass” that also holds 
for automation: the WDA is actor-independent. 
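
The idea of the optimal solution as the smallest state change, and of several aircraft jointly "eating up" the solution space, can be sketched with a brute-force search. This is only an illustration under stated assumptions: a flat two-dimensional plane, cone-shaped forbidden zones with unlimited look-ahead, heading changes limited to 90 degrees left or right, and a coarse grid. The names `in_zone` and `smallest_safe_change` are hypothetical, and a real automated agent would likely use a more refined method.

```python
import math


def in_zone(vel, intr_vel, rel_pos, sep):
    """True if `vel` lies inside the forbidden zone caused by one intruder."""
    dist = math.hypot(rel_pos[0], rel_pos[1])
    if dist <= sep:
        return True
    wx, wy = vel[0] - intr_vel[0], vel[1] - intr_vel[1]
    w = math.hypot(wx, wy)
    if w == 0.0:
        return False
    cos_a = (wx * rel_pos[0] + wy * rel_pos[1]) / (w * dist)
    return math.acos(max(-1.0, min(1.0, cos_a))) < math.asin(sep / dist)


def smallest_safe_change(own_vel, intruders, sep, v_min, v_max,
                         n_hdg=181, n_spd=41):
    """Grid-search the safe velocity closest to `own_vel`.

    intruders is a list of (rel_pos, intr_vel) pairs. Returns `own_vel`
    unchanged if it is already safe, or None if the grid holds no safe option.
    """
    def safe(v):
        return not any(in_zone(v, iv, rp, sep) for rp, iv in intruders)

    if safe(own_vel):
        return own_vel
    track = math.atan2(own_vel[0], own_vel[1])  # current heading, 0 = north
    best, best_cost = None, math.inf
    for i in range(n_hdg):
        # Headings from 90 degrees left to 90 degrees right of the current one.
        hdg = track + math.radians(-90.0 + 180.0 * i / (n_hdg - 1))
        for j in range(n_spd):
            # Speeds between the internal (performance) limits.
            spd = v_min + (v_max - v_min) * j / (n_spd - 1)
            cand = (spd * math.sin(hdg), spd * math.cos(hdg))
            cost = math.hypot(cand[0] - own_vel[0], cand[1] - own_vel[1])
            if cost < best_cost and safe(cand):
                best, best_cost = cand, cost
    return best
```

Because the same constraint test serves both the search and the display, the sketch also illustrates the actor-independence noted above: the forbidden zones are the same whether a pilot or an algorithm picks the maneuver.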

At the start of this chapter, we have asked ourselves the question whether 
there exists an approach to automation and interface design that helps pilots in 
performing the cognitive work. The answer is “yes, it exists.” It is the ecological 
approach to human-machine systems design that captures the essence of what is 
needed to construct interfaces and automation that allow human and automated 
agents to work together. When considering the differences between traditional 
designs and ecological designs, the ecological designs are richer and provide more 
meaningful information about the conflict situation. It allows the pilot to obtain a 
deeper understanding of the situation and the context of work, and the visualization 
of relative motion allows the pilots to directly observe the possibilities for actions 
and the consequences of taking an action. This is what traffic situation awareness 
is all about! It all starts with defining the events of interest and the ecological 
interface is a window on these events, visualizing the nested set of constraints 
that have the potential to shape performance and that constrain or allow possible 
actions (Flach, Mulder & Van Paassen, 2004). 

At the core of the design is the WDA which helps the analysts and designers 
to become experts in the problem at hand, understanding the functional means- 
ends relationships of the system-to-be-built, independent of who will do the actual 
work, the human or the automation, or both. It is the WDA that will show which 
elements of the work domain are so crucial that they have to be visualized on 
the display, to support knowledge-based behavior driven by representations of 
the deep structure of the work domain. It is the WDA that will lead the analyst 
to better understand what representations of the world exist and that could be 
used for the system design. The iterations that follow, involving prototyping and 
testing, may lead to novel insights into the problem and may result in adaptations 
of the analysis, the representation, and the interface. One of the things we learned 
is that the analysis takes a long time, and you have to become an expert yourself. 


Closing Statements 

In this chapter I have presented some of our work on taking an ecological approach 
to flight deck design. Our approach is to strive for a joint cognitive human- 
machine system, where cognition can be distributed in a dynamic way between 
humans and the automated systems through an ecological interface that provides a 
joint “window on the world” or, in other words, that represents the deep structure 
of a work domain. 

I have shown that ecological interfaces are not, by definition, simple. On the 
contrary, a good ecological interface reflects the complexity of the work domain. 
An important consequence is that in order to create an ecological interface for 
a complex work domain, the work domain analyst and interface designer (often 
the same person) must become an expert him- or herself. There is no recipe to be 
followed in the design, and the WDA for vehicles or other dynamic systems often 
requires deep knowledge and a good understanding of physical laws and the many 
constraints that come with them. This renders the ecological approach to human- 
machine systems design, interface and automation, a difficult one. 

In our lab the approach to EID usually starts with (a lot of) engineering 
analysis, modeling and describing the system at various levels to “fill in” the AH 
that then undergoes numerous iterations often in combination with evaluations 
of the prototypes developed. We go through many analysis and design iterations. 
What often helped us in developing new prototypes was that the graphs or sketches 
that we used to explain to ourselves the dynamics of the “problem space” were 
programmed into a computer, as a first prototype, to act as dynamic windows on 
how the part of the world that we analyzed actually “worked.” These first simple 
representations of action and problem spaces then often turned out to be already 
quite close to the final design and their dynamic nature helped us tremendously 
to figure out where our thinking was insufficient, or where our representation or 
world model was simply wrong. 

An important lesson from this chapter is that picking the “right” representation or 
state variables is often crucial to the success of the automation and interface design. 
When considering the modern flight deck and the automation on board, many 
systems are based on representations of the world that we now know are incomplete. 
An example is the current autopilot/auto-throttle combination that is based on the 
small-perturbation flight dynamics world model, whereas we have known since the early 
1970s that an energy-based representation would actually have been a much better 
starting point for aircraft flight control (Lambregts, 2013). These legacy systems 
will remain operational for the decades to come, and will continue to cause human 
factors problems such as mode awareness, mode confusion and the like. 

I hope the reader will see that the problem lies not in the automation or human, 
the problem lies in the inappropriate representation of the world that underlies 
the system design. Representations shape our view of things. They determine the 
grounds on which the automation and interface will be developed. To advance 
aviation, its automation, and human-machine interfaces, we should turn our efforts 
more into analyzing appropriate world models, creating meaningful interface 
designs and transparent automation, and solving the cause of the emergent “humans 
and automation” problems, rather than continue merely studying the consequences 
of legacy problems of “bad” design choices. 
Chapter 8 

Risk Perception in Ecological Information Systems 

Jan Comans, Clark Borst, M.M. van Paassen & Max Mulder 
Delft University of Technology, The Netherlands 


One issue that regularly occurs in the context of ecological information systems 
is that these systems can invite operators to migrate to the limits of system 
performance. This could lead to the assumption that ecological systems are 
thus inherently unsafe. We argue, however, that the source of this issue is tied 
to a modeling problem of the work domain. That is, the majority of ecological 
systems predominantly model the physical or causal structure of the work domain, 
thereby neglecting the intentional structure. Many complex socio-technical 
systems contain a mix of causal and intentional constraints—rules, procedures, 
and regulations—that contribute to safe operations of those systems. The work 
described in this chapter examines how visualizing intentional information in an 
ecological synthetic vision display affects pilot behavior, decision making, and 
safety in a terrain avoidance task. An experiment with 16 professional pilots 
showed that adding an intentional constraint increased the clearance during terrain 
avoidance and gave them more insight into the terrain avoidance task, which 
enabled them to make better tradeoffs between safety and performance. 


Background 

Cognitive Systems Engineering (Rasmussen, Pejtersen & Goodstein, 1994) and 
Ecological Interface Design (EID) (Vicente, 1999) paradigms are commonly 
regarded as guiding frameworks to develop transparent automation, allowing 
human agents to monitor the machine and fluently re-direct machine activities 
warranted by the demands of the situation. The guiding principle is for the interface 
to represent the field of possible solutions, rather than a single solution. This gives 
the human agent an opportunity to choose a solution that is adapted to dynamic 
situational constraints that might have been difficult to specify in the design of 
automated systems. Such an approach is seen as more robust and resilient, and can 
be contrasted with a brittle automation design that provides optimal advice most 
of the time, but fails spectacularly in a few cases. 
Although empirical studies have shown that such information aids enable the 
human to have a better system understanding and a better notion of the physical 
limitations, possibilities, and relationships within the work domain, humans 
often tend to propose actions that are sub-optimal, good enough, or even pushing 
the envelope (Rasmussen, 1997). For example, Borst, Mulder, and Van Paassen 
(2010) showed that their ecological synthetic vision display invited pilots to 
routinely violate minimum terrain clearances. This could lead to the assumption 
that ecological information systems are unsafe and can promote risky behavior. 

Although we share the same concern about seeking the limits of performance, 
we argue that the risky behavior is tied to the scope of the work domain analysis 
(WDA) of the system that is modeled rather than the EID framework itself. That 
is, the majority of ecological systems predominantly model the physical or causal 
structure of the work domain, thereby neglecting the intentional structure like 
rules, procedures, and regulations (Hajdukiewicz, Burns, Vicente & Eggleston, 
1999). 

For example, aviation safety is not only accomplished by the technical systems 
on board an aircraft, but also by standardized communication and coordination 
protocols, procedures, and airspace organization. So when the scope mainly 
includes the causal constraints, the physical structure in the environment will be 
made compelling and this can cause people to pursue these causal boundaries, 
leaving little room to prevent accidents. On the other hand, when the scope 
mainly includes intentional constraints, the system will generally be safer, but 
the operational range of causal systems can be significantly limited to effectively 
solve problems in novel situations. The EID approach, however, can also be used 
to make both the causal and the intentional constraints visible and it can also 
manipulate the relative salience of those constraints. 

In this chapter, we explore how an enhanced synthetic display with both 
intentional and causal constraints shapes pilot behavior and decision making in 
a terrain avoidance task. As such, it aims to answer the following question: when 
pilots are explicitly confronted with intentional constraints in addition to causal 
constraints, will they make better decisions and will they better understand the 
risks involved in those decisions? The work described in this chapter is essentially 
a repetition—in some aspects—of the experiment reported by Borst et al. (2010) 
with the addition of an explicit visualization of the required minimum safe altitude 
above terrain. 


Intentional Constraints 

Traditionally, display design is driven by task and work analyses to discover the 
information required on the display. The advantage of this approach is that it 
provides a display that is well suited for the tasks that need to be performed and 
usually requires limited mental effort. The downside of this approach is that the 
produced displays are not necessarily well suited for novel situations that were 
not included in the task analysis. Off-normal situations are usually extremely rare 
occasions, but in the context of aviation, they can have severe consequences when 
they are not handled correctly. Furthermore, designers might have an inaccurate or 
incomplete model of the world that will influence the task analysis and resulting 
design. 

EID was introduced almost 25 years ago as a design framework that aims to 
mitigate some of the drawbacks of task-based decision support systems (Vicente 
& Rasmussen, 1992). Its goal is to create a system that facilitates human creativity 
and flexibility to resolve novel situations unanticipated in the design of automated 
systems. The starting point for an ecological display is the WDA. The goal of 
this analysis is to identify relationships and constraints within the system under 
study that together determine the space of possibilities in how the system can be 
controlled. In the display, in addition to providing the goal states of the system, 
the complete control space is shown to the operator. This allows operators to 
adapt to the specific tasks they have to perform and the context in which they 
appear. Although this may require a slightly higher mental workload, it can enable 
operators to find creative solutions to unexpected events. 

The first step in a WDA is to determine the scope of the analysis. When dealing 
with process control or the dynamic aspects of controlling vehicle motion, the 
analysis will typically be focused on the physical aspects of the system under 
study. When looking at a complete airline operation, however, the focus will 
typically be less on the physical aspects of flying, but more on the financial and 
regulatory aspects, the intentional constraints. 

Up until now, the majority of the work on ecological displays in aviation 
has focused on causal constraints (Borst et al., 2010; Ellerbroek, Brantegem, 
Van Paassen, Gelder & Mulder, 2013). One recurring observation when 
experimenting with these displays is that operators will operate at the boundaries 
set by the constraints. In terrain avoidance experiments, for example, pilots will 
be able to clear the terrain obstacle with an ecological display, but they sometimes 
clear it with a very low margin (Borst et al., 2010). From a physical point of view, 
there is no problem, the obstacle is cleared. From a safety point of view, a low 
clearance is not favorable because it leaves little room for error to respond robustly 
to unexpected events, such as an engine failure during a climb. This 
boundary-seeking behavior may lead to the notion that systems employing EID may be 
inherently unsafe, because they invite operators to push the envelope. Although 
we are concerned about this behavior, we argue that it is not necessarily a property 
of ecological information systems. 

The safety issue described above is part of a larger problem. Actions that 
satisfy causal constraints are not necessarily optimal or desired actions. In aviation 
systems, a pilot’s decisions are also strongly influenced by rules, regulations, and 
procedures. In order to satisfy these constraints, they need to be included in the 
WDA and presented on the display. The lack of safety found in experiments with 
ecological information systems can be traced back to not complying with the rules, 
regulations, and procedures. In the context of terrain avoidance, for example, 
pilots apparently disobeyed the minimum safe altitude restrictions to clear 
obstacles. To include these rules, regulations, and procedures, the scope of the 
work domain needs to be extended to also capture the intentional structure of the 
work domain. Just like their causal counterpart, intentional constraints limit action 
possibilities and can shape operators’ behavior in the work domain. 

Figure 8.1 shows an abstract example of the resulting action space—all 
possible ways in which an operator can reach a goal state—with and without 
intentional constraints. The purely causal action space shows a situation where an 
operator can choose any trajectory as long as it satisfies the causal constraints. The 
action space with the addition of intentional constraints shows how the intentional 
constraints limit the space available to the operator to choose a trajectory. When 
operating within the intentional constraint boundaries, both sets of constraints 
are satisfied. Unlike with causal constraints, an operator can choose to ignore 
intentional constraints. A causal constraint is always a hard limit to the system 
performance and forms the accident boundary. Intentional constraints are soft 
limits that can be violated and therefore form the incident boundary. 

[Figure 8.1: both panels are labeled “Causal Constraint Boundaries”] 




Figure 8.1 A fully causal action space (left) and an action space with both 
causal and intentional constraints (right) 
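
The distinction between hard and soft limits can be made concrete with a small sketch. It assumes, purely for illustration, that the causal constraint is the terrain itself and the intentional constraint is a required minimum clearance above it; the names `Outcome` and `classify_profile` are hypothetical and not part of any display described in this chapter.

```python
from enum import Enum


class Outcome(Enum):
    NOMINAL = "satisfies causal and intentional constraints"
    INCIDENT = "violates the intentional (soft) constraint only"
    ACCIDENT = "violates the causal (hard) constraint"


def classify_profile(altitudes, terrain_heights, min_safe_clearance):
    """Classify a flown altitude profile against both sets of constraints.

    altitudes and terrain_heights are equal-length sequences sampled at the
    same points along the track, in the same unit as min_safe_clearance.
    """
    if len(altitudes) != len(terrain_heights):
        raise ValueError("profiles must have the same number of samples")
    outcome = Outcome.NOMINAL
    for altitude, terrain in zip(altitudes, terrain_heights):
        clearance = altitude - terrain
        if clearance <= 0.0:
            # Hard limit reached: the accident boundary.
            return Outcome.ACCIDENT
        if clearance < min_safe_clearance:
            # Soft limit crossed: the incident boundary.
            outcome = Outcome.INCIDENT
    return outcome
```

A trajectory classified as an incident is still physically possible, which is exactly why an operator can choose it when circumstances demand it.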

In a well-designed system, the intentional constraints will be well tuned to 
the tasks the operator has to perform. It will allow for all actions required to deal 
with a system that performs as expected. In complex domains, like the aviation 
domain, unexpected events will occasionally happen. These unexpected events 
can potentially force trajectories outside the intentional constraint boundaries. At 
this point, it is important for the operator to be able to clearly distinguish between 
causal and intentional constraints. As an example, consider a map display that 
shows noise sensitive areas that should not be overflown together with physical 
obstacles like radio towers that cannot be overflown. Under normal situations, a 
pilot will choose a trajectory that avoids all areas that need to be avoided. Under 
emergency conditions—for example, when an engine fails—the pilot can choose 
to ignore the noise sensitive areas and fly over them if it will increase the safety 
of the flight. In the meantime, the pilot will still have to avoid the radio towers. 
Overflying the noise-sensitive areas does not even have to be an all or nothing 
situation; the pilot can still decide to try to minimize his impact and find a trajectory 
that produces the least amount of nuisance. 

It will be important for the operator to be able to clearly distinguish between 
both sets of constraints. As discussed before, only showing causal constraints can 
lead to boundary-seeking behavior. On the other hand, only showing intentional 
constraints would decrease the apparent solution space, which can result in 
situations where satisfactory solutions to difficult situations (for example, safe 
violations of soft constraints) are not visible for consideration. 

A non-display related example of this second type of design limitation is the 
Flight Envelope Protection system implemented in all modern Airbus aircraft. 
This system limits the actual control inputs given by the pilot before they are sent 
to the control surfaces to make sure the aircraft does not surpass any structural 
limitations. The advantage of this system is that pilots cannot accidentally overstress 
the aircraft, and this works well in most day-to-day situations. However, in 1985 
a crew of a China Airlines Boeing 747 ended up in a situation where the only way 
out of a steep dive was to actually overstress the aircraft. The aircraft got damaged 
because of this maneuver, but was still flyable (National Transportation Safety 
Board, 1985). If a Flight Envelope Protection system had been in place to enforce 
the theoretical design limit, the aircraft would probably have crashed into the sea. 

This example shows that while under most circumstances pilots should be 
conforming to intentional constraints, situations may develop where they need 
to violate those constraints and focus on the causal limits of their vehicle. In this 
light, we argue that operators should be presented with both sets of constraints 
clearly visible. This allows operators to see both required behavior tied to the 
causal constraints and expected behavior tied to the intentional constraints. With 
this they should be able to make good tradeoffs between satisfying intentional 
and/or causal constraints. The following section describes an experiment to test 
this hypothesis. 


Experimental Evaluation of an Intentional Terrain Awareness Display 

Borst et al. (2010) introduced an ecological terrain awareness display. By combining 
a synthetic terrain display with a visualization of the aircraft’s climb performance, 
pilots were able to quickly and correctly determine their options when having to 
perform an emergency climb over a terrain obstruction. By visualizing the climb 
performance data, the number of terrain collisions was reduced to zero. The terrain 
clearance, on the other hand, decreased significantly, indicating that pilots were 
flying on the edge of the safe flight envelope. As discussed above, we argue that 
this is a result of limiting the scope of the WDA to causal constraints only. 

To test this hypothesis, we redesigned the display to include an intentional 
constraint: the Minimum Safe Altitude. By showing this constraint, in addition to 
the causal constraints, it was expected that the minimum terrain clearance would 
increase and that pilots would opt for more robust and thus safer control strategies 
to resolve terrain conflicts. By placing pilots in situations that are difficult to resolve 
without violating intentional constraints, we also expect that they will deliberately 
violate the intentional constraints when necessary to meet the requirements of safe 
operations. 

Subjects 

A mix of seven recently graduated and nine commercial pilots participated. Their 
average age was 40 years (SD 16.13) with an average experience of 3,370 flight 
hours (SD 3,923.07). Four of them were Dutch National Aerospace Laboratory 
test pilots. One of them was a former military F16 test pilot. Business jet pilots 
were chosen to be the group of commercial pilots because they are used to more 
dynamic and changing operations than air transport pilots. The recently graduated 
pilots were chosen for their lack of procedural habits. 

Apparatus 

The experiment was conducted in a fixed base flight simulator. The display was 
shown on an 18-inch monitor located in front of the pilot. An outside visual 
consisting of fog and cloud fragments was projected on the front- and side-walls 
to provide some sense of motion. The aircraft model was controlled by a right 
hand hydraulic side stick and a throttle quadrant on the left. The throttle contained 
the trim switch, autopilot disconnect switch and Horizontal Situation Indicator 
center button. A mode control panel on top of the instrument panel was used 
to control the Horizontal Situation Indicator course. A non-linear six degree of 
freedom Cessna 172 model was used for the experiment. Pitch, roll, and throttle 
commands were directly controlled by the pilot. To compensate for the lack of 
rudder pedals, a side slip controller was implemented to minimize side slip and 
engine torque effects. Two different performance settings were used during the 
experiment. During normal performance runs, the model operated in a normal 
International Standard Atmosphere giving the normal performance at the altitudes 
flown. In the reduced performance mode, the aircraft performance corresponded 
to what would be expected at low-density altitude conditions. In this mode, climb 
performance decreases significantly with altitude. In high performance conditions, 
the maximum climb angle was always between 7.5 degrees and 6 degrees, while in 
the low performance conditions the maximum climb angle deteriorated from 4.5 
degrees to 1 degree while climbing to the final altitude. 

Display 

The display used in the experiment is shown in Figure 8.2. It is similar to a Garmin 
G1000 NAV III augmented with a synthetic vision system (SVS). Three additional cues 
were added for the baseline display. The flight path vector indicated the geometric 
flight path, which provided immediate feedback to the pilot about his current 
trajectory. If the flight path vector is pointing at the synthetic terrain, the aircraft 
will eventually impact the ground at that position. If the flight path vector points 
above the terrain, the aircraft will clear the terrain. The maximum sustained climb 
angle at full power is shown by a wide green bar (no. 3 in Figure 8.2). This indication 
immediately shows whether the aircraft is able to clear the terrain at maximum climb 
performance. When this line is below the synthetic terrain, the pilot will not be able 
to climb over it and will have to use a different maneuver. Similarly, the current 
maximum climb angle (no. 2 in Figure 8.2) is the maximum climb angle that can be 
sustained with the current power setting/throttle position. With this indication, the 
pilot gets immediate feedback about his current climb performance and is able to 
climb at lower power settings while still being sure to clear the terrain. The result is 
an ecological baseline display showing the causal constraints of the terrain avoidance 
task. 

Figure 8.2 The synthetic vision display, showing: (1) the flight path 
angle; (2) the current maximum sustained climb angle; (3) 
the maximum sustained climb angle at full power; and (4) the 
intentional terrain constraint 
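The comparison that the climb angle cues support can be sketched numerically. The example below is only an illustration under simplifying assumptions (flat geometry, no wind, a constant climb angle, a single ridge straight ahead); the function names and the situation are hypothetical and do not describe the display software used in the experiment. The two climb angles are taken from the ranges reported in the Apparatus section.

```python
import math


def required_climb_angle_deg(distance_ft, height_gain_ft):
    """Constant flight path angle (degrees) needed to gain height_gain_ft over distance_ft."""
    return math.degrees(math.atan2(height_gain_ft, distance_ft))


def clears_terrain(climb_angle_deg, distance_ft, altitude_ft, ridge_elevation_ft):
    """True if a straight climb at climb_angle_deg passes over the ridge top."""
    needed = required_climb_angle_deg(distance_ft, ridge_elevation_ft - altitude_ft)
    return climb_angle_deg >= needed


# Hypothetical situation: aircraft at 2,000 ft, ridge top at 3,000 ft, 20,000 ft ahead.
needed = required_climb_angle_deg(20_000, 1_000)       # roughly 2.9 degrees
high_ok = clears_terrain(6.0, 20_000, 2_000, 3_000)    # within the high performance range
low_ok = clears_terrain(1.0, 20_000, 2_000, 3_000)     # lowest reduced performance angle
```

In this sketch the ridge can be cleared with a 6 degree climb but not with a 1 degree climb, which corresponds to the green bar lying above or below the synthetic terrain.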

The baseline display is augmented with an intentional layer indicating the 
minimum safe clearance above the terrain (no. 4 in Figure 8.2). This layer is created 
by shifting the synthetic terrain up and drawing it in amber behind the physical 
terrain. In this way, the layer has the same relationship to the flight path vector as 
the original terrain. If the flight path vector is above the layer, the terrain clearance 
will be at least the required minimum clearance. An additional advantage of adding 
the intentional layer to the display is that it improves distance perception of terrain 
features, something that is very difficult in traditional synthetic vision displays. 
Because the layer has a fixed height, its resulting thickness on the display is an 
indication for the distance to the terrain. Even though the relationship between 
distance and thickness is non-linear, it can be used for a crude estimation of the 
actual distance of terrain features. 
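The non-linear relationship between distance and layer thickness can be illustrated with a small geometric sketch. It assumes a simple perspective model in which the on-screen thickness is proportional to the visual angle between the terrain top and the top of the layer; the function name and the example values are hypothetical, and the actual rendering of the display may differ.

```python
import math


def layer_angular_thickness_deg(distance_ft, terrain_top_ft, eye_altitude_ft,
                                layer_height_ft=1000.0):
    """Visual angle (degrees) between the terrain top and the top of the layer."""
    lower = math.atan2(terrain_top_ft - eye_altitude_ft, distance_ft)
    upper = math.atan2(terrain_top_ft + layer_height_ft - eye_altitude_ft, distance_ft)
    return math.degrees(upper - lower)


# Hypothetical case: the aircraft is level with a 3,000 ft ridge.
near = layer_angular_thickness_deg(10_000, 3_000, 3_000)   # about 5.71 degrees
far = layer_angular_thickness_deg(20_000, 3_000, 3_000)    # about 2.86 degrees
```

Doubling the distance makes the layer thinner, but not by exactly one half, which is why the thickness supports only a crude distance estimate.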

Scenario 

The scenario used for the experiment consisted of an artificial terrain with a number 
of narrowing fjords. The base of the fjords was at sea-level and the tops were 
around 3,000 ft. The pilots were told that they had flown into the wrong fjord on their 
way to the airport and were low on fuel by the time they realized their mistake. Each 
experiment run started in one of the predetermined initial locations at an altitude 
below the surrounding fjord tops. From this starting position it was impossible 
to get to the airport without climbing over the terrain. A navigation beacon was 
placed between the aircraft’s initial positions and the airport to provide a 
navigation reference that could be reached in three to five minutes, eliminating 
the need for long cruise segments to reach the airport. Pilots were instructed to 
navigate to the waypoint and keep clear of the terrain in a way they considered safe 
and comfortable. Pilots were not given minimum altitude instructions, but they 
received a map of the area that showed 4,000 ft as the minimum safe altitude for 
the area they were navigating. 

Independent variables 

The experiment used three within-subject variables each having two levels: display 
configuration, scenario difficulty, and aircraft performance, as shown in Table 8.1. 
The display configuration was either a baseline display without the intentional 
layer or a display with the intentional layer visible. Two levels of difficulty were 
used, as illustrated in Figure 8.3. Easy conditions started with enough margin for 
a straight climb toward the beacon. The Hard conditions required immediate full 
power and a maximum performance climb to avoid transgressing the intentional 
layer. 

Two levels of aircraft performance were used. With normal performance, climb 
performance remained almost constant during the first 4,000 ft of the climb. With 
the reduced performance, climb performance severely deteriorated above 2,500 ft. 
The order in which the display configuration was tested was used as a between-subject 
variable. One group of pilots started with eight runs using the baseline 
display and then moved to the augmented display. The other group started with the 
augmented display and moved to the baseline display afterward. 
Figure 8.3 Scenario difficulty. Easy conditions (left) start with enough 
margin to easily perform a straight climb above the intentional 
layer. Difficult conditions (right) start with the maximum climb 
angle close to the top of the intentional layer 


Table 8.1 Condition overview 

Name                     Difficulty   Performance   Display Configuration 
Easy High Performance    Easy         High          Baseline 
                         Easy         High          Intentional 
Hard High Performance    Hard         High          Baseline 
                         Hard         High          Intentional 
Easy Low Performance     Easy         Low           Baseline 
                         Easy         Low           Intentional 
Hard Low Performance     Hard         Low           Baseline 
                         Hard         Low           Intentional 


Dependent measures 

Three objective measures were used to quantify pilot behavior. The minimum 
terrain clearance and number of clearance violations were used as a measure of 
safety. The final altitude was used as a measure of procedural compliance. Next to 
these objective measures, notes were taken during each run describing the pilots’ 
choices. After each run, pilots provided feedback about their strategy and choices. 
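As an illustration of how the objective measures could be derived from a recorded run, the sketch below computes them from a time trace. The trace format, the function name, and the sample values are assumptions made for this example only; they are not the data or the analysis code of the experiment.

```python
def run_measures(trace, required_clearance_ft=1000.0):
    """Summarize one run.

    trace: list of (aircraft_altitude_ft, terrain_elevation_ft) samples in time order.
    """
    clearances = [altitude - terrain for altitude, terrain in trace]
    min_clearance = min(clearances)
    return {
        "min_clearance_ft": min_clearance,                      # safety measure
        "violation": min_clearance < required_clearance_ft,     # clearance violation flag
        "final_altitude_ft": trace[-1][0],                      # procedural compliance measure
    }


# Made-up trace: sea-level fjord base, a ridge near 3,000 ft, then lower terrain.
example_trace = [(1500, 0), (2500, 1800), (3600, 3000), (4100, 2900)]
measures = run_measures(example_trace)
```

For this made-up trace the minimum clearance is 600 ft, so the run counts as a violation of the 1,000 ft clearance even though the final altitude is above 4,000 ft.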

Procedure 

The experiment started with a training phase to familiarize the pilots with the 
display, flight controls, and aircraft model. During training, the pilots could 
fly around freely in a training scenario and they were given an explanation 
of the display features. Once the pilots were familiar with the added features, 
the measurement runs began. No task-specific training was done, only display 
familiarization. 

The pilots were divided into two groups: one started without the intentional 
additions, the other started with the intentional additions enabled. Initially there 
were 18 pilots divided equally between both groups, but during the experiment, two 
pilots failed to complete the experiment. This resulted in nine pilots starting without 
intentional additions and seven pilots starting with intentional additions. Each pilot 
flew eight conditions per block (two difficulty levels, two performance levels, and 
two repetitions). The conditions were randomly distributed based on a Latin square 
matrix to avoid effects based on the condition order. Different Latin squares were 
used for both blocks. At the end of the experiment, 256 samples were collected. 
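The sample count follows directly from the design: 16 pilots, two display blocks, and eight runs per block. The sketch below reproduces this arithmetic and shows one possible way to build a Latin square of run orders. The cyclic construction is an assumption chosen for illustration; the chapter does not state which Latin squares were used or how they were randomized.

```python
from itertools import product

DIFFICULTY = ("Easy", "Hard")
PERFORMANCE = ("High", "Low")
REPETITIONS = (1, 2)
DISPLAY_BLOCKS = ("Baseline", "Intentional")
PILOTS = 16  # pilots who completed the experiment

# Eight runs per block: 2 difficulty levels x 2 performance levels x 2 repetitions.
runs_per_block = list(product(DIFFICULTY, PERFORMANCE, REPETITIONS))
total_samples = PILOTS * len(DISPLAY_BLOCKS) * len(runs_per_block)


def cyclic_latin_square(items):
    """Each item appears exactly once in every row and every column."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


# Each row is one candidate run order for a block (illustrative only).
orders = cyclic_latin_square(runs_per_block)
```

Here `total_samples` evaluates to 256, matching the number of samples reported above.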

Before each measurement run started, the pilots were instructed to set the 
throttle to the trim position. Once the run started, the autopilot maintained altitude 
and airspeed for five seconds. During this time, the pilot was asked to observe the 
situation. After the autopilot disconnected, the pilot had to confirm the disconnect 
by pushing a button and navigate the aircraft toward the navigation beacon. Once 
they were sufficiently close to the beacon, the run ended and the pilots provided 
feedback about their strategy during the run. 

At the end of the experiment the pilots were asked to complete questionnaires 
to evaluate the overall experiment. 


Results 

Minimum clearance 

The minimum clearance for all conditions and repetitions is shown in Figure 
8.4. The box plots show that the spread of the minimum clearance decreases and 
the lower bound shifts toward the 1,000 ft line when the intentional constraint is 
shown. Both repetitions of the Easy High Performance condition show considerable 
variation in the data, but with the intentional constraint visible all clearances are 
well above 900 ft, except for one outlier, a pilot who chose to fly at less than 500 
ft but indicated after the run that he felt compelled to increase his clearance in the 
next runs. In the Hard High Performance condition, the spread in clearances is 
large when the intentional constraint is not shown. The lower bound is at 500 ft 
or lower. This changes significantly when intentional constraints are shown: the 
clearances cluster around the top of the intentional layer with the exception of a 
few outliers. In the Easy Low Performance condition, a similar effect is shown: 
clearances are more clustered and generally above 1,000 ft. In the final Hard Low 
Performance condition, the change in spread is less than in the other conditions, 
but there is a clear shift toward the top of the intentional layer. 

Figure 8.4 Minimum clearance per condition and repetition. The 
whiskers indicate the lowest and highest datum within 1.5 times the 
interquartile range (IQR) of the lower and upper quartile 

To further analyze the objective clearance measure, a repeated measures 
Analysis of Variance (ANOVA) was performed. To simplify the analysis, the 
results from the repetitions were averaged per condition, resulting in eight data 
points per pilot. This averaging should not distort the results too much since the 
majority of the pilots showed a reasonable consistency between the repetitions in 
terms of strategy and minimum clearance. The increase in clearance is confirmed 
with the ANOVA showing a significant effect for the display type (F(1,14)=5.44, 
p < 0.05). No significant interactions were found for the difficulty and performance 
variables with the display type. 
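The averaging step that precedes the ANOVA can be sketched as follows. The record layout, the function name, and the clearance values are hypothetical and serve only to show how 16 runs per pilot reduce to eight data points; the ANOVA itself is not reproduced here.

```python
from collections import defaultdict
from itertools import product
from statistics import mean


def average_repetitions(samples):
    """Average the repetitions within each pilot x display x difficulty x performance cell.

    samples: iterable of (pilot, display, difficulty, performance, repetition, clearance_ft).
    """
    cells = defaultdict(list)
    for pilot, display, difficulty, performance, _repetition, clearance in samples:
        cells[(pilot, display, difficulty, performance)].append(clearance)
    return {cell: mean(values) for cell, values in cells.items()}


# Made-up data for one pilot: 900 ft in the first repetition, 1,100 ft in the second.
made_up = [
    ("P01", display, difficulty, performance, repetition, 900.0 if repetition == 1 else 1100.0)
    for display, difficulty, performance, repetition in product(
        ("Baseline", "Intentional"), ("Easy", "Hard"), ("High", "Low"), (1, 2)
    )
]
averaged = average_repetitions(made_up)
```

The 16 made-up runs collapse to eight cell means, one per combination of display, difficulty, and performance.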

During the post-experiment questionnaire one of the pilots remarked that his 
strategy without intentional layer was impacted by his experience starting with the 
intentional layer enabled. Analyzing the display order did not show a significant 
effect (F(1,14)=0.687, p=0.064) but was close to the p=0.05 significance level. 
Further research with more pilots might reveal an actual influence of the display 
order. 

The only other significant effects were the main effects of the scenario 
complexity (F(1,14)=23.446, p<0.01) and aircraft performance (F(1,14)=15.332, 
p<0.01) variables. This confirms that the task became more difficult both with 
decreasing performance and with increasing difficulty. 
Clearance violations 

Figure 8.5 shows the clearance violations at two clearance altitudes, one at the 
intentional clearance of 1,000 ft and one slightly lower at 900 ft to filter out minor 
violations. In the Easy conditions there is a clear distinction between the runs with 
and without the intentional constraint visible. At the 900 ft level, the violations 
drop to one in 32 runs with the intentional constraints visible at both performance 
levels. In the Hard conditions there is also a decrease in violations when the 
intentional constraints are shown, but the difference is smaller than in the Easy 
conditions. 
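Counting violations at two thresholds, as done for Figure 8.5, can be sketched in a few lines. The minimum clearance values below are invented for illustration and are not the experimental data.

```python
def count_violations(min_clearances_ft, threshold_ft):
    """Number of runs whose minimum clearance fell below threshold_ft."""
    return sum(1 for clearance in min_clearances_ft if clearance < threshold_ft)


# Made-up minimum clearances (ft) for six runs.
made_up_clearances = [1250, 980, 940, 870, 1010, 450]

at_1000 = count_violations(made_up_clearances, 1000)   # includes minor violations
at_900 = count_violations(made_up_clearances, 900)     # minor violations filtered out
```

With these made-up values, four runs violate the 1,000 ft clearance but only two remain at the 900 ft level, which shows how the lower threshold separates minor from more serious violations.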



Figure 8.5 Clearance violations per condition and repetition 

Final altitude 

For the final objective measure, Figure 8.6 shows the final altitude at the end of the 
run for all conditions. With the average terrain height at 3,000 ft, the 4,000 ft level 
indicates the minimum safe altitude above the terrain. The majority of all runs 
end at or above 4,000 ft. However, in the Hard conditions without the intentional 
constraint visible, a number of pilots end up below 4,000 ft. With the intentional 
constraint visible, all pilots except a few outliers have a final altitude above 
4,000 ft and with a much lower spread compared to the runs without intentional 
constraints. 
Figure 8.6 Final altitude per condition and repetition. The whiskers 
indicate the lowest and highest datum within 1.5 times the IQR of 
the lower and upper quartile 
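The whisker definition used in the box plots can be made concrete with a short sketch. It relies on the default quartile method of Python's `statistics.quantiles`; the quartile convention used for the original figures is not stated, so this is an assumption, and the sample values are made up.

```python
from statistics import quantiles


def box_plot_whiskers(data):
    """Return (lower_whisker, upper_whisker, outliers) using the 1.5 x IQR rule."""
    q1, _median, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return min(inside), max(inside), outliers


# Made-up minimum clearances (ft), with one pilot far below the rest.
made_up = [400, 900, 950, 1000, 1050, 1100, 1150, 1200]
lower, upper, outliers = box_plot_whiskers(made_up)
```

For this made-up sample the whiskers extend from 900 ft to 1,200 ft, and the 400 ft value is drawn as an outlier.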

Strategy adaptation 

During the experiment, pilots were not instructed to use a specific strategy. This 
resulted in a wide range of different strategies employed during the experiment. 
In order to analyze how pilots’ strategies changed when the intentional constraints 
were shown, all time traces were classified according to the strategy that was used. 
Two main criteria were used: the horizontal trajectory and the combination of 
power and pitch. 

Five general trajectory strategies were used by most pilots: (1) flying straight 
until clear of the terrain; (2) flying directly toward the navigation beacon; (3) orbiting 
until clear of the terrain; (4) turning toward the lowest point of the terrain; and (5) 
flying parallel to the ridge. In terms of power and pitch, three different strategies 
were identified: (1) climbing at full power with the maximum climb angle; (2) 
climbing with enough power to achieve a clear margin above the terrain; and (3) 
climbing with the power required to keep the climb angle above the intentional layer. 

When looking at the actual changes in strategies when the intentional layer 
was visible, no real differences were observed between the High Performance and 
Low Performance conditions. During the Easy scenarios, however, approximately 
20 percent of the runs resulted in a changed horizontal trajectory, and 29 percent 
switched to a strategy where they used a combination of power and pitch to keep 
the flight path vector just above the intentional layer. Not all trajectory changes 
can be attributed to the addition of the intentional layer. Some runs provided no 
clear indications of the reason for using a different strategy. The changes in engine 
power and path on the other hand are all directly influenced by the addition of 
the intentional layer since pilots actively used the intentional layer to choose an 
engine power setting and climb angle. 

In the Hard scenarios, the way in which strategies changed differs substantially. 
During 40 percent of the runs, pilots changed their trajectory when presented 
with intentional constraints. The changes observed in the Hard scenarios can be 
divided in two broad categories. A number of pilots changed to a strategy where 
they would actively search for the lowest point in the intentional layer and used 
it to direct their aircraft to these low spots. Another group used the intentional 
layer to determine that their previous strategy would not work and switched to 
a different strategy. In terms of power settings, all pilots used maximum power 
in combination with the maximum climb angle for all runs and no changes were 
observed in this strategy for the Hard conditions. 

Questionnaire data 

At the end of the experiment each pilot answered four questions about their 
experience with the intentional constraints. Fourteen pilots responded positively 
to the question whether the intentional addition makes the terrain avoidance task clearer. 
The main reasons they gave were the increase in situation awareness, better 
perception of height above the terrain, and a reduced mental load when planning 
a terrain avoidance maneuver. One pilot stated that the procedural addition makes 
the task more restrictive. Another pilot, however, indicated that the intentional 
layer was distracting and ignored the additional information. 

A question about whether the intentional layer changed their strategy was 
confirmed by 12 pilots. The majority indicated that the additional information 
presented enables them to quickly see the lowest regions of the terrain and also 
provides immediate information about the engine power required to reach the safe 
altitude. One pilot also noted that he intentionally put his flight path vector in the 
intentional layer because he felt comfortable with a lower clearance and the layer 
gave him a good indication what pitch angle was required to satisfy this clearance. 
Of the pilots answering no, two indicated their strategy remained the same but 
they used the intentional layer as a confirmation of their strategy. 

Every pilot, except the one that ignored the intentional constraints, felt that the 
perceived level of safety increased when the intentional constraints were visible. 
This mainly happened because the safety margin became explicit in the display 
enabling them to directly assess the risk involved. One pilot noted that adding the 
procedural constraints takes the guessing out of flying. The final question whether 
the pilots considered the procedural additions useful was answered positively by 
all but one pilot and is in line with the answers to the previous questions. 


Discussion 

The main objective of the experiment was to investigate whether visualizing 
intentional constraints in addition to causal constraints helps pilots in making better 
decisions. Better decisions in the context of this experiment meant respecting the 
minimum safe clearance as much as possible. Analysis of the objective clearance 
parameter confirmed this hypothesis. There is a significant increase in minimum 
clearance when comparing the intentional addition to the baseline display. 

To get more insight into this change in clearance, Figure 8.4 presented the 
clearance values of all pilots per condition and Figure 8.5 showed the number of 
pilots that flew below the minimum safe clearance. In the Easy High Performance 
conditions all but two pilots flew above the minimum safe clearance when the 
intentional layer was enabled. One pilot deliberately ignored the layer and accepted 
a clearance of less than 500 ft. The second pilot violated the minimum clearance 
and kept his flight path vector close to the intentional layer, resulting in a very brief 
excursion just below minimum clearance. In the post-run feedback and through 
observations it became clear that with sufficient performance and margin pilots 
will treat the amber intentional constraint as if it is actual terrain and will avoid it. 

In the Hard High Performance conditions, the same strategy surfaced. Pilots 
were more inclined to try to meet the minimum clearance with the intentional 
layer enabled. There were more violations than in the Easy High Performance 
condition, but, except for one, they were all minor violations. By steering into the 
top of the intentional layer pilots could make an informed choice about sacrificing 
a small amount of clearance for a quicker route toward the airport. 

In the low performance conditions, the same trends can be observed as in 
the high performance conditions. The main difficulty in the low performance 
conditions was that the climb performance significantly decreases during the 
climb. A number of pilots failed to note that there was not enough margin between 
the maximum climb performance and the intentional layer, but even in these cases 
the intentional layer showed that they were closer to the top of the minimum 
clearance and could continue in a relatively safe manner. 

An analysis of the data showed that with the baseline display a number of 
pilots flew with less than 500 ft clearance. For two pilots this was a deliberate 
choice; the other pilots were mainly unaware of their actual clearance. Not 
counting the deliberate violations would leave approximately ten instances where 
pilots flew below 500 ft above the terrain without being aware of this. This number 
is only a third of what Borst et al. (2010) reported in a previous experiment with 
an ecological terrain awareness display. The main reason for this difference is 
probably the fact that pilots had more freedom to perform an escape maneuver in 
this experiment while in the previous experiment pilots were asked to fly straight 
as much as possible. 

The influence of the intentional constraint on the final altitude, shown in Figure 
8.6, is also an interesting result. While the majority of the pilots definitely aimed to 
fly at 4,000 ft, there were a number of pilots that were slightly below this altitude. 
Enabling the intentional layer shifted the final altitude for all but five pilots, above 
4,000 ft. As long as there is even a small part of the intentional layer above the 
horizon, indicating that there was still terrain that would be cleared by less than 
1,000 ft, pilots will have a tendency to climb until the intentional layer is below 
the horizon. Without the intentional layer, some pilots accepted an altitude slightly 
below 4,000 ft. One reason for this could be that once the actual ridge is cleared, 
the urgency of the situation decreases because the majority of the visual feedback 
is gone. With the intentional layer, this information will still be obvious as long as 
it is still relevant, that is, as long as the clearance is less than 1,000 ft. 

The way in which pilots adapted their strategy when presented with intentional 
constraints depended on the difficulty of the scenario. In the Easy conditions, the 
most obvious adaptation was when pilots directly used the intentional layer to 
determine the required engine power setting and corresponding climb angle to clear 
the terrain with the expected margin while preserving fuel as much as possible. In 
other instances, pilots used the added information to confirm their strategy. While 
doing the same as in the corresponding baseline conditions, a number of pilots 
indicated after the run that they felt more confident and were more aware of the 
safety margin. Finally a number of pilots did not change their strategy, but used 
the intentional layer to fine tune it. For example, a number of pilots that performed 
a straight climb to 4,000 ft before making the turn toward the navigation beacon 
followed the same strategy but used the intentional layer as a cue to decide when 
to initiate the turn toward the navigation beacon. 

In the Hard conditions, there was less leeway for the pilots: the scenarios forced 
pilots to quickly use all available performance. In these situations, the intentional 
layer in combination with the indication of the maximum climb performance 
gave the pilots an instant assessment of their options. With this information, a 
number of pilots decided to change their strategy and for example make a few 
climbing turns. Other pilots stuck with their strategy and used the intentional layer 
to maximize their clearance by keeping their flight path vector out of the intentional 
layer or at least in the top part. By doing this, they sacrificed some clearance, but 
they were able to clearly assess the severity of this violation. Finally, a number of 
pilots used the intentional layer to guide their flight path toward the lowest parts of 
the intentional layer, keeping their clearance close to or above 1,000 ft. 

Through the observations, post-run feedback of the pilots, and the questionnaire 
it became clear that the majority of the pilots used the intentional addition to either 
improve or change their strategy in solving the terrain conflict. The way in which 
they fit the intentional addition into their strategy can differ but they all indicated 
that it enhanced their analysis and awareness of the task at hand. 
One drawback of the current intentional representation surfaced during the 
experiment. Once a pilot flies below the minimum clearance, the whole top part 
of the display is filled with the amber color. Once this happens, it is no longer 
possible to directly perceive the difference between a minor violation close to 
the top of the intentional layer or a dangerous violation close to the terrain. In 
the future this could be resolved by using different shades of amber to indicate 
different clearance levels in the intentional layer. 

Looking back at the hypothesis, this experiment has provided some insights 
into the usefulness of visualizing intentional constraints. The freedom given to 
the pilots to implement their own preferred solution resulted in the adoption of 
a wide variety of strategies, but the majority used the additional information to 
their advantage. Some used it to verify their strategy, some to fine tune, and some 
used it to completely change their strategy. All of this is based on the visualization 
of the intentional constraint. Explicitly showing information that is otherwise 
presented through different modalities—charts, knowledge, other instruments— 
enables pilots to focus on immediate use of this information instead of mentally 
piecing together all the pieces. 

In line with this, pilots were able to use strategies that are otherwise impossible 
without elaborate calculations. As an example, some pilots decided to accept a 
small clearance violation in order to fly a shorter route. They would put their flight 
path vector in the intentional layer, but close to the top. In this way they could be 
certain that they would still have a sufficient margin above the terrain. Without the 
intentional layer, these types of strategies would require the pilots to accurately 
determine their position with the help of a map and a navigation beacon, calculate 
the climb performance and check if the performance is sufficient. These types 
of calculations quickly become time consuming and are not feasible when quick 
decision making is required. 

Finally the experiment showed that the current representation subtly drives 
pilots toward the intended margin. The pilots were not explicitly instructed to 
never fly into the intentional layer; they were only instructed on how the display 
worked and the kind of information presented to them. Even without explicit 
instructions, pilots showed a natural tendency to stay above the intentional layer 
if this was feasible. 


Conclusion 

The main objective of the experiment was to investigate whether 
visualizing intentional constraints in addition to causal constraints helps pilots 
in making better and more robust decisions. Better decisions in the context of 
this experiment meant respecting the minimum safe altitude as much as possible. 
Analysis of the objective clearance parameter confirmed this hypothesis. There 
was a significant increase in minimum clearance when comparing the intentional 
addition to the baseline display that only portrayed causal constraints. Visualizing 
the minimum safe altitude resulted in better compliance with the intentional 
constraint and resulted in the pilots being able to make better tradeoffs between 
performance and safety, confirming our hypothesis. 

Based on the experiment results, we can conclude that incorporating intentional 
constraints can shape operators’ behavior and can shift them from operating close 
to the physical constraints toward a point closer to the intentional boundaries. 
The importance of teams or crew communication in aviation has long been recognized. 
Communication among members of the crew is critical in the timely execution of 
procedures, shared situation awareness, and effective decision making (Kanki, Folk 
& Irwin, 1991; Kanki & Foushee, 1989). Failures in communication have been 
cited as causal factors in aviation accidents. In fact, Wiegmann and Shappell (2001) 
classified commercial aviation accidents between 1990 and 1996 according to their 
Human Factors Analysis and Classification System (HFACS) scheme and found 
that nearly 30 percent of them were associated with crew resource mismanagement. 
Indeed, Crew Resource Management training has been developed to address the 
need for effective communication and other forms of crew interactions. 

Crew coordination is related to communication in that it involves the timely and 
effective passing of information from one team member to another which is often 
accomplished through explicit communication (Gorman, Amazeen & Cooke, 2010). 
For effective coordination, timing is of the essence and timing mistakes can mean 
the difference between effective and ineffective team performance. For example, 
communication about the presence of another aircraft on a potential collision course 
with one’s own aircraft is communication that is needed sooner rather than later. 

Together, communication and coordination are important cognitive processes 
that happen at the team level (Cooke, Gorman, Myers & Duran, 2013). These 
cognitive processes can be trained and can be facilitated or hampered by technology. 
In this chapter we focus on the increasing use of text chat in some aviation contexts 
and its impact on the cognitive processing of the crew (Cummings, 2004; Heacox, 
Moore, Morrison & Yturralde, 2004). Although voice is still a preferred means of 
communication in civil and commercial aviation, in the area of unmanned aerial 
system ground control, text chat has become the preferred mode of communication 
(Hamilton, Cooke, Brittain & Sepulveda, 2013). 

Unmanned Aerial Systems (UAS) are not unmanned at all but are remotely 
operated by dozens of people on the ground (Cooke, Pringle, Pedersen & Connor, 
2006). There are differences in platforms and concomitant operating procedures. 
In general, individuals on the ground have different roles that include the pilot or 
air vehicle operator (AVO), the sensor or payload operator (PLO), and the mission 
planner or navigator (DEMPC). These individuals are in tight communication with 
each other and with others who play a role in the larger system such as intelligence 
analysts, sensor data exploiters, mission command, weather personnel, and 
maintenance personnel. As is the case for manned aviation, a significant portion 
of mishaps can be attributed to poor crew communication and coordination 
(Tvaryanas, Thompson & Constable, 2006). 

In particular, much crew coordination is required at target waypoints (for 
instance, locations that need to be photographed). In a laboratory UAS simulator 
of one of the authors, the coordination needed to photograph a target involves: 

1. information about the target to be passed by the mission planner to the pilot; 

2. a negotiation between the pilot and sensor operator regarding the mapping 
of camera settings to the altitude and airspeed of the vehicle; and 

3. feedback from the sensor operator to other team members that a good 
picture has been taken. 

This type of coordination has been modeled as a dynamical system and patterns 
of coordination that unfold over time have been shown to be related to effective 
performance (Gorman et al., 2010). 

In the laboratory paradigm described above, communication occurred using 
voice over an intercom equipped with push-to-talk buttons. Operators wore 
headsets and pushed and held buttons corresponding to the teammate or teammates 
they wished to talk with. In recent studies, we have incorporated text chat in the 
lab as a form of communication. This mode is an increasingly preferred mode of 
communication in military UAS operations and indeed in other military settings 
such as the Air Operations Center (Weil, Duchon, Duran, Cooke & Winner, 2008) 
where timing is also of the essence (for example, time-sensitive targeting). Further, 
using text chat modes facilitates interactions with synthetic teammates (Ball et 
al., 2010). In these settings, in fact, text chat occurs in a set of chat windows, 
each window dedicated to a particular function. In most cases operators interact 
through multiple windows and in some cases may have five chat windows (or 
more) open at a time (Hamilton, Brittain, Cooke & Sepulveda, 2013). 

Given the increasing use of text chat in these settings, it is important to understand 
the implications of this mode of communication for crew communication and 
coordination. We therefore ask in this chapter how communicating through text 
chat, compared to voice communications, affects team effectiveness. We 
address this question in three ways: 

1. an analysis of an electronic text chat corpus to define features of 
text-based communication; 

2. an empirical study of text versus voice communication; and 

3. an agent-based model that compares text chat with voice communications. 


Features of Text-based Communication 

Given that the nature of communication is changing from a solely spoken form 
to a mixture of text and spoken communications, one must consider the features 
of text-based communications. Such features include increased communication 
duration, terse messages, dialogue that is asynchronous in nature, decreased 
transience relative to spoken communications, and an impoverished message 
relative to spoken communications (for example, intonation and so on). 

Information disseminated through text takes longer to communicate than spoken 
information due in part to the time taken to produce the information in textual form 
and the time taken to read and interpret the text. Further, text communications are 
naturally asynchronous. With text communications one can read the information 
at any time after the information becomes available for consumption (that is, after 
it arrives in a chat window, an email inbox, and so on). However, spoken communication 
is often a real-time activity; information is received as it is communicated (though 
there are exceptions such as recorded messages). 

Text communications are less transient than spoken ones. For example, a 
letter can be read multiple times, and text in chat windows and emails can be 
searched by keyword for important, but forgotten, information. Even though information 
persists longer in text communications, the messages are less rich than spoken 
communications (Lengel & Daft, 1988). For example, tone-of-voice is often 
difficult to convey or understand in text communications, but quickly understood 
via spoken communications. 

Text-communication features can be thought of as constraints on performing any 
task requiring communications. These constraints may or may not align 
with those of spoken communications. Consequently, some tasks may be better 
suited to text communications than to spoken communications, and the reverse 
is also true. For example, in time-critical domains that require 
multi-component coordination to achieve a goal, text communications may slow down 
the coordination process to such a degree that goal achievement is impossible. 
Also, voice communication has been found to be superior to text communication 
when conveying spatial information (Fu, D’Andrea & Bertel, 2013). However, 
if communication clarity (in terms of transmission clarity and clarity of the 
communicated information) and communication transience (how long the message 
remains available to the receiver) are of high importance, such as when communicating 
a target for elimination, then text communications may be preferable to spoken forms. 

Analysis of chat corpus 

In this section we analyze text communications collected in a UAS simulated 
task environment to determine if there is any regularity in the information 
communicated. The task requires three teammates (pilot, AVO; navigator, 
DEMPC; photographer, PLO) to communicate with one another to successfully 
photograph reconnaissance targets. In this study, the PLO and DEMPC were told 
that the AVO was human and working at a different location (that is, offsite) or that 
the AVO was a synthetic agent (that is, synthetic). Teams performed five 40-minute 
missions. Given the structure of the task, it is likely that there is communication 
overlap across teams. However, this overlap may be small because participants 
were free to communicate as they saw fit, as long as communication was text-based. 

The number of communications sent among the PLO, DEMPC, and AVO 
totaled 11,625 messages. Teams that were instructed that the pilot was located in a 
different location sent fewer messages, on average (M = 104; SD = 47), compared 
to teams that were instructed that the pilot was a synthetic teammate (M = 111; 
SD = 36), though this difference is not statistically reliable. Further, the number of 
messages tended to increase across missions, though nonsignificantly (see Figure 
9.1). The proportion of sent messages differed based on the role one played on the 
team (for example, the photographer, navigator, and pilot), but did not seem to 
differ whether the participant in a particular role was informed that the pilot was a 
synthetic entity or was operating at a different location from the rest of the team. 
These results suggest that the number of text communications may change with 
experience and with teaming condition, although neither trend was statistically reliable. 

Given the nature of the task and the ability for participants to choose how they 
communicate, information could be conveyed in an infinite number of ways. For 
example, if the navigator wanted to share information about a waypoint and its 
effective radius size, altitude, and speed restrictions, one way could be: “LVN has 
an er of 5, a speed rest of 200, an alt rest of 1500,” whereas the effective radius 
and restrictions of another waypoint could be communicated with a message like: 
“H-AREA/er=2.5/s=100/alt=2000.” In order to handle some of the variation in an 
analysis of text communications, we used regular expressions to regularize aspects 
of the communications that would differ across messages based on mission-relevant 
information (for example, waypoint names, restriction values, and so on) while 
preserving the structure of the message. Thus, the first message would be changed 
to the form “_Wpt has an _Radius of _N, a _Speed _Restriction of _N, an _Altitude 
_Restriction of _N” and the second example would be changed into the form 
“_Wpt/_Radius _Equals _N/_Speed _Equals _N/_Altitude _Equals _N.” This provided a means 
for regularizing and grouping messages to determine message-type prevalence. 
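
As a rough illustration of this kind of regularization, the following Python sketch replaces mission-specific tokens with placeholders and then counts how often each resulting structure occurs. The waypoint list, the abbreviation patterns, and the function names are assumptions made for the example; the chapter does not list the regular expressions that were actually used.

```python
import re
from collections import Counter

# Hypothetical waypoint names; a real analysis would use the mission's own list.
WAYPOINTS = ("LVN", "H-AREA")

# Ordered (pattern, placeholder) rules. These are illustrative assumptions,
# not the rules used in the study.
RULES = [
    (re.compile(r"(?<!\w)(?:" + "|".join(map(re.escape, WAYPOINTS)) + r")(?!\w)"), "_Wpt"),
    (re.compile(r"\b(?:er|effective radius)\b", re.IGNORECASE), "_Radius"),
    (re.compile(r"\b(?:alt|altitude)\b", re.IGNORECASE), "_Altitude"),
    (re.compile(r"\b(?:s|speed)\b", re.IGNORECASE), "_Speed"),
    (re.compile(r"\b(?:rest|restriction)\b", re.IGNORECASE), "_Restriction"),
    (re.compile(r"\s*=\s*"), " _Equals "),
    (re.compile(r"\b\d+(?:\.\d+)?\b"), "_N"),
]


def regularize(message: str) -> str:
    """Apply each rule in order, keeping the message structure intact."""
    for pattern, placeholder in RULES:
        message = pattern.sub(placeholder, message)
    return message


def group_messages(messages):
    """Count how many raw messages share each regularized structure."""
    return Counter(regularize(m) for m in messages)
```

Under these assumed rules, two messages that differ only in waypoint name and numeric values fall into the same group, which is the property the grouping step relies on.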

Figure 9.1 Text messages sent over the course of an experiment. Shading 
represents the standard error of the mean; dark shading 
represents overlap of SE for the offsite and synthetic conditions 

Using this approach, there were 7,527 groups of regularized messages, 
reducing the corpus by 35 percent. Regularized message groups that contained 
10+ instances of that message token comprised 2,572 of the total number of 
messages (11,625). Thus, about 22 percent of the messages sent had the same 
structure 10+ times. Regularized message groups that contained more than one 
instance (2+) comprised 4,992 of the total number of messages (~43 percent of all 
messages had a structure used more than once). Thus, ~57 percent of all messages 
sent had a unique message structure. These results simultaneously demonstrate the 
variable nature of communications in text-based systems as well as the potential 
for reusing message structures/grammars. 
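
The percentages above follow directly from the reported counts. The short Python sketch below recomputes them and shows one way to obtain such shares from a table of group sizes; the function name and the toy counts are illustrative assumptions, not part of the original analysis.

```python
from collections import Counter

TOTAL_MESSAGES = 11_625   # messages in the corpus (reported)
GROUPS = 7_527            # regularized message groups (reported)
IN_GROUPS_10_PLUS = 2_572 # messages in groups with 10+ instances (reported)
IN_GROUPS_2_PLUS = 4_992  # messages in groups with 2+ instances (reported)

corpus_reduction = 1 - GROUPS / TOTAL_MESSAGES        # about 0.35
share_10_plus = IN_GROUPS_10_PLUS / TOTAL_MESSAGES    # about 0.22
share_2_plus = IN_GROUPS_2_PLUS / TOTAL_MESSAGES      # about 0.43
share_unique = 1 - share_2_plus                       # about 0.57


def share_in_groups(group_counts: Counter, min_size: int) -> float:
    """Fraction of all messages that belong to groups of at least min_size."""
    total = sum(group_counts.values())
    covered = sum(n for n in group_counts.values() if n >= min_size)
    return covered / total
```

For example, with toy group sizes of 10, 2, 1, and 1, `share_in_groups` with `min_size=2` gives 12/14.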

When only considering the top ten message structures used in this study, nearly 
all of the messages were task-relevant, yet some occurred significantly more than 
others. For example, the most popular message type, an affirmation (that is, Affirm; 
“Copy,” “Yes,” and so on), occurred 547 times within the corpus, with 31 different 
ways of affirming information; “Roger” was the most frequent with 106 
instances, followed by “ok” with 105 instances. The second most popular message 
type, specific waypoint names, occurred only 148 times (see Figure 9.2). 

Figure 9.2 Top ten message types broken out by condition 

Summary 

There is considerable variation in the text communications we collected and 
analyzed even though they occurred in the context of a well-specified and structured 
simulated task environment. Fewer than half of the messages had a structure that 
was repeated at all. This is somewhat surprising within a relatively well-structured task. It 
is quite possible that reducing message variability may improve performance 
through disambiguation. Alternatively, it may impede performance by requiring 
individuals to force desired communications into pre-specified structures. 

Given the slow rate of information dissemination over text communications 
relative to voice communications, it would make sense to suggest that individuals 
minimize the amount of typing to communicate important information. One 
suggestion would be to take advantage of abbreviations and punctuation. A 
potential drawback of using abbreviations and punctuation is the insertion of 
unintended ambiguities. 


Text Chat versus Voice Experiment 

The research presented here took place as part of a larger project with the Air Force 
Research Laboratory that replaces a human UAS pilot with a cognitively plausible 
computational model that serves as a full-fledged synthetic teammate for a 
three-agent UAS ground control crew. Not only is the extension of the ACT-R cognitive 
modeling architecture of interest (Ball et al., 2010), but the larger project will 
address questions about team coordination: What is the nature of coordination and 
collaboration (within all-human or mixed human-synthetic teams) in UAS ground 
control settings and what do deficiencies in synthetic teammate interactions with 
human teammates reveal about human-automation coordination needs? 

Prior to inserting the synthetic teammate into the loop with two human 
participants, an experiment with all-human teams was conducted to establish 
baselines and is reported here. To establish a baseline for a new text chat mode of 
communication, team performance and coordination were examined using text chat 
communication and compared to voice communication. Because text chat is not a 
transient signal like voice and because communications can occur asynchronously, 
there is a possibility that coordination among teammates using text chat will be 
altered. Specifically, coordination should be impacted by the asynchronous nature 
of communication. It is unclear whether performance will be affected, but if 
coordination is made more difficult, performance is also likely to be negatively 
impacted in this task. We conducted an experiment to address questions about 
coordination and text chat. 


Team Coordination in the Cognitive Engineering Research on Team 
Tasks Unmanned Aerial System—Synthetic Task Environment 
(CERTT UAS—STE) 

This experiment was conducted in the context of the CERTT UAS-STE 
(Cooke & Shope, 2005). The UAS-STE is based on 
the United States Air Force Predator UAS ground control station. The UAS-STE 
task requires a team of three people to complete the task of photographing critical 
waypoints. Each team member was assigned to one of three roles: an AVO, a PLO, 
or a DEMPC. The DEMPC plans a mission route through multiple waypoints, the 
AVO is responsible for flying the simulated UAS and monitoring UAS systems, 
and the PLO takes photographs of designated waypoints and monitors camera 
systems. The roles are interdependent, where each role requires input from other 
team members to complete the team’s goal of photographing designated waypoints. 
Further, the CERTT UAS—STE is dynamic and taking good photographs of 
designated waypoints requires information to be shared among teammates in a 
timely manner. A single UAS—STE mission consists of 11-12 targets and lasts a 
maximum of 40 minutes; each team performs five 40-minute missions. 

Over a decade of research conducted in the CERTT UAS-STE has indicated 
that team interaction in the form of coordinated information passing and 
communication is important for predicting team performance and has led to a theory 
of Interactive Team Cognition (Cooke et al., 2013). In particular, coordination 
is based on the timely sending and receiving of information required for taking 
good photographs of designated waypoints. A coordination score (κ) is based on 
the timing and sequence with which key pieces of information are communicated 
among teammates (Gorman et al., 2010). The coordination score (κ) is computed 
as the amount of time from when information (I) about waypoint w is passed from 
the DEMPC to the team to when feedback (F) about taking a good photograph 
of waypoint w is provided to the team from the PLO. This is then divided by the 
amount of time from when the PLO and AVO negotiate (N) UAS flight dynamics 
for waypoint w up to when the PLO provides feedback (F) that a good photograph 
was taken for the waypoint w. 

$$\kappa_w = \frac{F_w - I_w}{F_w - N_w}$$
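
The coordination score described above can be sketched in a few lines of Python. The function and parameter names are hypothetical, and the timestamps are assumed to be in seconds from the start of the mission; this is a minimal illustration of the ratio, not the scoring software used in the study.

```python
def coordination_score(info_time: float,
                       negotiation_time: float,
                       feedback_time: float) -> float:
    """Ratio of the information-to-feedback interval to the
    negotiation-to-feedback interval for a single waypoint.

    info_time:        when the DEMPC passes waypoint information (I)
    negotiation_time: when the PLO and AVO negotiate flight dynamics (N)
    feedback_time:    when the PLO reports a good photograph (F)
    """
    denominator = feedback_time - negotiation_time
    if denominator <= 0:
        # Assumption for this sketch: feedback must follow negotiation.
        raise ValueError("feedback must occur after negotiation")
    return (feedback_time - info_time) / denominator
```

A score above 1 means that information was passed before the negotiation began; a score below 1 means the negotiation started before the information was passed.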

The purpose of this study was to collect baseline data in the context of the 
CERTT UAS—STE task with all human teams communicating via text chat, the 
mode of communication that will be used with the synthetic teammate. This mode 
was compared to voice communications, used in previous studies. Also, given the 
preponderance of text-based communications in our society and its adoption in 
time-critical military and civilian contexts, the comparison of text versus voice 
as modes of communication is relevant and of increasing importance. By many 
accounts (Baltes, Dickson, Sherman, Bauer & LaGanke, 2002; Weeks, Kelly & 
Chapanis, 1974) the use of text chat may not be the best mode of communication 
in time-pressured circumstances. The purpose of the experiment was to investigate 
how text-based communications affect team performance and coordination 
within the UAS—STE. Based on previous research, we hypothesized that teams 
communicating with text would coordinate differently from teams communicating 
using voice and that teams communicating with voice would perform the task 
better than those using text. 

Method 

Participants 

Twenty three-person teams composed of college students and members of the general 
population of the Mesa, Arizona area voluntarily participated in one 6.5-hour 
session. Individuals were compensated for their participation by payment of 
$10.00 per hour, with each of the three team members on the highest performing 
team receiving a $100.00 bonus. The majority of the participants were male, 
representing 75.9 percent of the sample. Individuals were randomly assigned to 
either a voice or text chat communication condition. The participants were also 
randomly assigned to teams and to one of three roles. All members of teams were 
unfamiliar with each other when they arrived for their sessions. 

Equipment and materials 

The experiment took place in the CERTT Laboratory configured for the UAS— 
STE (described earlier). Participants in the text chat condition communicated 
using the keyboard and a custom-built text communications system designed to 
log speaker identity and time information. The text communications interface 
was divided into three separate modules. The receiver module alerted participants 
with a lighted button when a message from another team member was sent. 
The receiver module also allowed participants to read incoming messages by 
pressing and holding the F10 key. On releasing the F10 key, the message was then 
displayed in the storage module, which was comprised of a window that contained 
previously received messages in a list. Participants were given the ability to scroll 
through the messages by pressing the F7 and F8 keys. Participants sent messages 
with the transmit module. To send messages, participants first typed their message 
in the transmit module window, selected the recipient using the F3, F4, and F5 
keys, and then pressed F1 to send. The interface enabled participants to select 
multiple recipients. Each message was time stamped with when it was sent (F1 
key-presses) and when it was received (F10 key-presses) in order to compute 
coordination scores (κ) and dynamics. Participants in the voice communications 
condition communicated with each other and the experimenter using David Clark 
headsets and a custom-built intercom system designed to log speaker identity and 
time information. The intercom enabled participants to select one or more listeners 
by pressing push-to-talk buttons. 
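
To make the role of the two time stamps concrete, the sketch below shows how a send-to-read lag could be derived from a message log. The record layout and field names are assumptions for illustration; the actual log format of the custom text communications system is not described in the chapter.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ChatMessage:
    # Hypothetical log record; field names are not taken from the study.
    sender: str
    recipient: str
    sent_at: float   # time of the send key-press, in seconds
    read_at: float   # time of the read key-press, in seconds

    @property
    def lag(self) -> float:
        """Delay between sending a message and the recipient reading it."""
        return self.read_at - self.sent_at


def mean_lag(messages) -> float:
    """Average send-to-read lag over a collection of logged messages."""
    return mean(m.lag for m in messages)
```

In the voice condition the equivalent lag is effectively zero, because a spoken message is received as it is transmitted.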

Custom software (seven applications connected over a local area network) 
ran the synthetic task and collected values of various parameters that were used 
as input by performance-scoring software. A series of tutorials were designed 
in PowerPoint for training the three team members. Custom software was also 
developed to conduct tests on information in PowerPoint tutorials, to collect 
individual taskwork relatedness ratings, to collect workload and situation 
awareness ratings, to administer knowledge questions, and to collect demographic 
and preference data at the time of debriefing. This report will focus on performance 
and coordination data. 

Procedure 

The experiment consisted of one seven-hour session. The AVO was located in 
a separate room adjacent to the other members (DEMPC and PLO). The AVO 
entered the building through a separate entrance located on the opposite side of 
the building, and was not allowed to have contact with the other members until 
debriefing. In the session, the team members were seated at their workstations 
where they signed a consent form, were given a brief overview of the study, and 
started training on the task. 

The number of targets varied from mission to mission in accordance with the 
introduction of situation-awareness roadblocks at set times within each mission. 
Missions were completed either at the end of a 40-minute interval or when team 
members believed that the mission goals had been completed. Following each 
mission, participants were given the opportunity to view their team score, their own 
individual score, and the individual scores of their teammates. The performance 
scores were displayed on each participant’s computer and shown in comparison to 
the mean scores achieved by all other teams (or roles) who had participated in the 
experiment up to that point. 

Results 


Team performance 

Team performance was measured using a composite score based on the result 
of mission variables including time each individual spent in an alarm state, 
time each individual spent in a warning state, rate with which critical waypoints 
were acquired, and the rate with which targets were successfully photographed. 
Penalty points for each of these components were weighted a priori in accord 
with importance to the task and subtracted from a maximum score of 1,000. Team 
performance data were collected for each of the five missions. 

Figure 9.3 Team performance means for each mission differed over 
missions, but not condition 
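
The penalty-based composite score can be illustrated with a small Python sketch. Only the maximum score of 1,000 and the idea of weighted penalties come from the text; the component names and the weights below are hypothetical placeholders, since the a priori weights are not reported in this chapter.

```python
MAX_SCORE = 1000.0

# Hypothetical weights; the study's a priori weights are not given here.
EXAMPLE_WEIGHTS = {
    "alarm_seconds": 2.0,      # time spent in an alarm state
    "warning_seconds": 0.5,    # time spent in a warning state
    "missed_waypoints": 50.0,  # critical waypoints not acquired
    "missed_photos": 100.0,    # targets not successfully photographed
}


def team_score(penalties: dict, weights: dict = EXAMPLE_WEIGHTS) -> float:
    """Subtract the weighted penalty points from the maximum score."""
    total_penalty = sum(weights[name] * penalties.get(name, 0.0)
                        for name in weights)
    return MAX_SCORE - total_penalty
```

With these example weights, a mission with 30 seconds in alarm, 60 seconds in warning, one missed waypoint, and two missed photographs would score 1,000 - 340 = 660.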

Team performance was analyzed using a 2 (text, voice) x 4 (mission) mixed 
ANOVA. Each communication condition (text, voice) had ten teams. There 
was a main effect of mission, F(3, 54) = 9.447, p < .001. Teams improved their 
performance score across the first four missions. There was no significant effect 
of communication condition, F(1, 18) = 0.57, p = 0.46, although the voice teams 
consistently had higher performance scores across all missions than teams in the 
text chat condition (see Figure 9.3). 

Least Significant Difference (LSD) pairwise comparisons showed that team 
performance improved over the course of the first four missions, with significant 
gains between the first two missions (p = .005) and between the second and fourth 
missions (p = .015). 

Coordination 

Based on the inherent time costs of using text chat (for example, typing, noticing 
a message arrived, and so on), there was a significant time lag between when a 
message was sent and when it was received (M = 10.5 s for text; 0 s for voice). To 
determine if there was a difference in coordination score between voice and text 
chat, a 2 (communication mode) x 4 (lower workload missions) mixed ANOVA 
was conducted on coordination scores. There was a significant main effect of 
communication mode: text chat teams had a lower coordination score than voice 
teams (p = 0.042). This is not to say that the voice condition coordinated "better," but only 
to say that the two communication conditions coordinated differently. Further, 
a measure that reveals the stability of team coordination dynamics, the Hurst 
exponent (Treffner & Kelso, 1999), was also analyzed to determine if there was a 
coordination stability difference between communication groups. An independent 
samples t-test on the average Hurst exponents across teams revealed that text chat 
teams, on average, coordinated in a more stable fashion (M = 0.9527, SD = 
0.0131) than voice teams (M = 0.8988, SD = 0.061), t(15) = 2.287, p = 0.037. Thus 
we can conclude that the patterns are different, but the degree of stability (without 
context) does not support a relative value judgment. 

For the four low-workload missions the median of the performance scores 
was 310 in the chat condition. A regression analysis on all the teams combined 
revealed that the linear trend between communication lag and team performance 
was significant (F(1, 38) = 9.06, p = 0.005), indicating that as lag decreased, 
performance increased. Regression analyses also revealed a marginally significant 
positive linear relationship between performance score and Kappa in teams performing above 
the median performance score (F(1, 13) = 4.46, p = 0.055). Overall, these results 
indicate that text chat results in different coordination patterns than voice communication and 
that there is a relationship between these patterns and team performance. 


Agent-based Modeling of Text Chat versus Voice 

An agent-based model is a type of computational model through which agents 
can represent humans with unique cognitive and behavioral characteristics. Rules 
of agent interaction with each other can also be assigned to study macro-level 
processes such as team cognition that emerge from the micro-level interactions that 
are modeled (Grimm & Railsback, 2005). Agent-based modeling can be used 
as a complementary approach to laboratory experiments for understanding team 
coordination and communication. An agent-based model can be used to extend 
laboratory experiments to predict performance in large teams, which are difficult 
and expensive to study in the lab. Also, an agent-based model can be used to ask 
questions that are hard to ask and recreate with human participants such as the 
impact of individual team member characteristics on team effectiveness. 

An agent-based model of the UAS ground control task was used to extend 
the text versus voice experiment and examine the effect of using multiple chat 
windows on team performance. The model has three human agents (depicted as 
green, blue, and black human figures), one UAS agent (depicted as a yellow airplane 
figure), and 12 targets to survey and photograph (depicted as red-colored cross 
marks). Of the three human agents, one represented a DEMPC, one represented 
an AVO, and one represented a PLO. The location of the 12 targets was made 
available only to the DEMPC agent. The AVO was assigned the responsibility of 
maneuvering the UAS and the PLO was given information about the appropriate 
position to take the photograph for each target. The DEMPC agent sent coordinates 
to be surveyed to the AVO agent either initially when the simulation began or only 
when it received feedback from the PLO that the last target had been photographed. 
The AVO used that information from the DEMPC to maneuver the UAS to the 
specified target. On reaching the target, the AVO informed the PLO. The PLO 
looked at the position of the UAS with respect to the target and determined if it 
was the appropriate position to take a photograph of the target. If the position of 
the UAS was not appropriate, the PLO informed the AVO about the appropriate 
position and the AVO moved the UAS accordingly. Once the PLO and AVO had 
agreed on parameters, the PLO marked the target as photographed and informed 
the DEMPC and the simulation moved on to the next target. 

Each agent used a memory variable to hold information sent by another agent. 
The three human agents had been modeled to have memory decay with the rate of 
decay “d” determined by the experimental condition. The human agents in the voice 
condition had high memory decay (d = 2.5) as compared to agents in the text condition 
for which the memory decay was low (d =1.5), simulating the persistent nature of text 
over voice. The rate of decay determined memory activation and when the memory 
activation was below a low threshold the agent forgot about the alert or information 
sent by the other agent. Activation (A) was calculated using the following formula: 

$$A = \frac{1}{(t_{\text{current}} - t_{\text{memory}})^{d}}$$

Here t_current is the current simulation time, t_memory is the simulation time 
stored in memory, and d is the decay factor. This formula for the activation of an 
agent’s memory is loosely based on the ACT-R formula for activation (Anderson & Lebiere, 1998). 
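
A minimal Python sketch of this decay mechanism is given below. It assumes that activation is the reciprocal of the elapsed simulation time raised to the decay factor, which is one reading of the formula above. The forgetting threshold is a hypothetical value, since the text says only that a low threshold was used.

```python
VOICE_DECAY = 2.5  # high memory decay (from the text)
TEXT_DECAY = 1.5   # low memory decay (from the text)

# Hypothetical threshold; the chapter does not report the actual value.
FORGET_THRESHOLD = 0.01


def activation(current_time: float, stored_time: float, decay: float) -> float:
    """Memory activation after (current_time - stored_time) simulation ticks."""
    elapsed = current_time - stored_time
    if elapsed <= 0:
        raise ValueError("activation is defined only after the item is stored")
    return 1.0 / elapsed ** decay


def is_forgotten(current_time: float, stored_time: float, decay: float,
                 threshold: float = FORGET_THRESHOLD) -> bool:
    """An item is forgotten once its activation falls below the threshold."""
    return activation(current_time, stored_time, decay) < threshold
```

Four ticks after storage, for instance, activation is 1/32 with the voice decay factor but 1/8 with the text decay factor, so the same item stays above any given threshold longer in the text condition.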

In the voice condition, the agents were modeled to produce a lag in responding to 
an alert received from other agents, and this lag was a random duration between zero 
and five simulation ticks. Similarly, in the text condition there was a lag in response 
of between seven and 12 simulation ticks. The lag was modeled as a composite of the lag 
in encoding the received information in memory plus the lag in responding to another 
agent after activating the memory. Therefore the lag for the voice condition started at 
zero because there would not be any lag in encoding to memory, but there would 
be a lag in responding. In the text condition there would be a lag in both encoding 
and responding to the other agent. The range of the lag to respond was set to be 
the same in both conditions (zero to five ticks); however, the lag to encode was set 
higher for agents using text chat than for agents using voice. The lag was also 
multiplied by a random number between one and the number of chat windows. For 
example, if the number of chat windows was three, the lag was multiplied by a random 
number between one (no extra delay due to other windows) and three (a delay due to 
all three windows). All agents were also modeled to expect an acknowledgement from 
the other agent when they alerted it or sent it information. When the receiver agent forgot 
about the alert, the sender agent resent the information after a certain amount of 
time (12 simulation ticks). The agents were restricted to forgetting up to three times, 
after which they had to acknowledge and carry out their tasks. In summary, agents 
in the voice condition were modeled to have a low lag time to encode, but high memory 
decay. The agents in the text condition were modeled to have a high lag time to 
encode, but low memory decay. 
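
The lag rules above can be summarized in a short Python sketch. The encoding lags (zero ticks for voice, seven for text) and the zero-to-five-tick response lag follow from the ranges given in the text. Drawing the chat-window multiplier as a whole number is an assumption of this sketch, because the chapter does not say whether the random multiplier was an integer. All names are hypothetical.

```python
import random

ENCODE_LAG = {"voice": 0, "text": 7}  # ticks; implied by the 0-5 and 7-12 ranges
MAX_RESPONSE_LAG = 5                  # ticks; the same in both conditions


def sample_lag(mode: str, chat_windows: int = 1, rng=random) -> int:
    """Draw one response lag, in simulation ticks, for a receiving agent."""
    if chat_windows < 1:
        raise ValueError("at least one chat window is required")
    base = ENCODE_LAG[mode] + rng.randint(0, MAX_RESPONSE_LAG)
    # Assumption: the multiplier is a whole number between 1 and chat_windows.
    multiplier = rng.randint(1, chat_windows)
    return base * multiplier
```

With a single window the multiplier is always one, so text lags stay between seven and 12 ticks; with three windows a text lag can reach 36 ticks, which is why completion time grows quickly as windows are added.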

The model was executed for 100 repetitions with both voice and text as 
conditions and with one to four chat windows for the text condition. As 
shown in Figure 9.4, the model indicates a gradual increase across conditions 
in the time taken to plan, maneuver, and photograph all 12 targets, with agents in 
the voice condition completing the mission quickest and agents in the text condition 
with four chat windows taking the most time to complete the mission. 

Figure 9.4 Plot of condition against time taken to complete the mission 

Interestingly, there is not much difference between text and voice with one 
chat window, which mirrors our empirical findings in which team coordination 
was altered by the use of chat and performance was negatively impacted for text, 
but not significantly so. But when additional chat windows are used the negative 
effects of using text chat are predicted by the model to be much greater. It takes 
over twice as long to process a target when two windows are open and over six 
times as long to process a target when four windows are open. This may present 
a significant problem when using multiple windows of text chat in time-sensitive 
targeting missions, as is the practice in many military command and control 
settings. Future research should examine the validity of this prediction. 


Conclusion 

In many military aviation contexts electronic text chat has become the primary 
communication mode, with voice communications being used less often. This is 
occurring in a climate in which electronic texting on mobile devices is a common 
form of interpersonal communication. It is obvious that there are differences 
between voice and text communications, but the implications of these differences
for communication in the aviation context, and especially in contexts that are
of a time-sensitive nature, are not clear. This chapter described a three-pronged
approach to addressing these implications. 

First we have presented an analysis of a text-chat corpus collected in the
CERTT UAS-STE three-person simulator. Importantly, text differs from voice in
its persistence, which affords certain advantages when communicating details that
may escape memory. However, text is also asynchronous, with a mean encoding
lag of 10.5 seconds in the UAS-STE. There are also ambiguities in text chat, just
as there are in voice communications, and, surprisingly, we identified a tremendous
amount of variation in the construction of text messages, even in a well-structured
task environment. It may be possible to reduce ambiguities (for machine or
human receiver) in text chat by constraining the chat to particular formalisms or
abbreviations, much as there are standards for voice communication (that is,
brevity code). However, such constraints may cause additional overhead, which
further disrupts team performance. The effect of these ambiguities on team 
coordination and performance is an open question, as is the possible development 
of increasing regularity as a team works together over time. 

Second, we have conducted an empirical comparison of text chat and voice. 
From this study we found that text chat changes the coordination patterns of a
team and that this coordination is, in some cases, related to team performance.
Overall team effectiveness was greater when communicating using voice than when
using text chat; however, that difference was not statistically reliable.

The third approach to this problem was through agent-based modeling. Here 
we developed a model of agent coordination and then extended that model to the 
case of multiple windows. The use of multiple windows is also a characteristic
of the use of chat in military operations. The model suggests that there is an
exponential impact of multiple chat windows which could increase time to process 
the communications, thereby negatively impacting team performance. 

Overall, the use of text chat for time-sensitive operations may be imposing a 
coordination handicap on a situation in which timely coordination is paramount. 
The use of multiple windows is predicted to exacerbate the problem. Allowing 
operators to select chat or voice communications, as is typical in current 
operational environments, raises the question of when to use which mode and 
how to most effectively combine the two modes. Though electronic text chat may 
provide persistent records that benefit certain types of operations, its use in time- 
sensitive applications should be carefully considered with timing disadvantages 
weighed against advantages in persistence. Further, text chat standards may help
to clarify communication, but they will not remove the encoding lag.


Chapter 10

Human-centered Automation as Effective 
Work Design 

Amy R. Pritchett & Karen M. Feigh 
Georgia Institute of Technology, USA 


This chapter describes how the challenge of human-centered automation can 
be recast as the challenge of, first, designing the work performed by a team of 
agents and then, second, allocating this work amongst all the agents, human and 
automated. The chapter starts by examining teams, and identifying where inclusion 
of automated agents adds new requirements to the work of teams. The chapter
then describes work as a construct that can be formally analyzed
and designed to create effective human-automation interaction. An example
highlights key tradeoffs in designing and allocating work in teams of human and 
automated agents: no one allocation can maximize all the desired attributes of 
human-centered automation. 


Introduction 

Work may be generally defined as “effort directed to some purpose or end.” Thus, 
it is purposeful activity directed at goals established by the mission or a concept of 
operation. Further, work must be examined from an ecological perspective: work 
is achieved by acting on a dynamic environment in response to its demands. This 
environment describes, constrains, regulates, and structures the dynamics of the 
work; thus, the environment may have inherent dynamics which agents' actions 
need to mirror, may provide affordances which need to be sensed and capitalized 
upon, and may constrain agents' actions. 

Here, we view work as a construct applied at the level of a team of one or more 
humans interacting with each other and with one or more automated agents. Thus, 
the overall taskwork that needs to happen, and its overall structure and dynamics, 
is driven by the team goals and by the environment. The taskwork emerges 
out of the collective behavior of all agents in the team, human and automated, 
even when some of the agents may not see how their activities contribute. The 
allocation of functions within the team creates the additional need for teamwork. 
This teamwork also expands each individual’s perception of the environment to
include both relevant parts of the taskwork environment and the teamwork aspects
for interacting with his/her team members. 

The team’s work environment includes not only a physical structure but also, 
in many domains, a procedural structure established as a safe nominal course of 
action and as the basis for coordination between distal agents. For example, a 
pilot’s operating environment includes standard operating procedures, reflected 
physically by cockpit features referred to by name. Likewise, procedures provide 
a structure for coordination between distal individuals whose interaction may need
to be brief yet reach a quick resolution, such as between an air traffic controller
and a pilot newly entering the controller’s sector. Like many physical constraints (for example,
limits on operating temperature), procedures may be disregarded in off-nominal 
conditions; however, their influence in nominal operations, through social and 
regulatory mechanisms, cannot be ignored. 

For a given goal and work environment, fundamental properties of the team’s 
collective taskwork can be identified through analysis of the environment’s 
affordances and constraints (for example, Naikar, Pearce, Drumm & Sanderson, 
2003). For example, the goal of flying a constant heading and altitude requires 
work (instrument scans and control inputs) mirroring the dynamic response of 
the aircraft. However, while the collective work of the team can be deduced from
engineering analysis, the work of individuals within the team is specific to, and can
only be fully defined by, the function allocation within the team.

From this viewpoint, two things may be designed: the concept of operation 
defining the goals and structure of the overall taskwork, and specification of
teamwork, starting with the function allocation. The concept of operation
describes what the team as a whole must produce; and it is constrained by key 
structures in the environment which the work needs to mirror. The specification 
of teamwork then examines the individual agents, seeking to allocate functions 
and identify constructs for effective teamwork, including human-automation 
interaction. 

We further refine this viewpoint to specifically note how allocating functions 
to automation also changes the behavior of its human teammates. This viewpoint 
builds on team design and computational organization theory, and thus this chapter 
starts by reviewing their insights. Specifically, while these domains examine 
team structure, team communication and collaboration, the impact of trust, and 
cognitive and social behaviors, they have not considered the unique concerns with 
combining teams of human and automated actors beyond some consideration of 
intelligent software agents. The chapter then models the work of teams of human 
and automated agents, including specific concerns with allocating functions to 
automation. An example highlights key tradeoffs in designing and allocating work 
in teams of human and automated agents: this chapter ends with the assertion 
that no one design can maximize all the desired attributes of human-centered 
automation.


Considerations in Function Allocation in Teams of Humans and
Automated Agents

How well does automation fit into the definitions, models, and metrics used in 
organization and team design? While some definitions of automation focus more 
on control automation capable of executing functions (for example, Sheridan, 
1992, p. 3), others focus on “a machine capable of performing functions commonly 
attributed to humans” (for example, Parasuraman & Riley, 1997, p. 231), clearly 
describing automation in terms of function allocation relative to humans, 
including cognitive functions. Within this definition, automation includes alerting 
systems, intelligent displays and automatic judges, and decision and planning aids 
(for example, Bass & Pritchett, 2008; Parasuraman, Sheridan & Wickens, 2000; 
Pritchett, 2005; Wiener & Curry, 1980). 

Several automation studies have gone further in describing
automation as a team member. For example, Muir (1987, 1994) related models and
measures of trust from the social sciences to human trust in automation; Bass and 
Pritchett (2008) modified social judgment theory to quantitatively model human 
interaction with automated judges; Pritchett (2005) proposed framing human 
interaction with alerting systems in terms of the same type of “role” descriptions 
used within human teams; and Sarter and Woods (2000) explicitly described flight 
path automation as a team member (albeit often a poor one!). More explicitly, 
Woods (1985) and Woods and Hollnagel (2006) suggested that "good" automation 
should create a diverse joint human-machine cognitive system. Likewise, 
Miller and Parasuraman’s (2007) playbook metaphor compares function allocation to
automation with delegation in human teams. Thus, comparing the design
of human teams with teams of humans and automated systems is well-grounded in 
the human-automation research community. 


Relevant Insights from Team and Organization Design 

Within the broad range of definitions of team provided in the literature, Salas,
Dickinson, Converse, and Tannenbaum (1992) provide a generally accepted 
definition of teams as a collection of (two or more) individuals working together 
inter-dependently to achieve a common goal. Such teams may range from being 
highly structured and interdependent to those whose members interact infrequently 
and instead execute the team’s work from a shared group context. Our crude 
engineering focus describes these behaviors as teamwork functions, but they may 
also include intricate social and cognitive aspects. 

Organizational or team design seeks to specify both the structure and the 
strategy of the team or organization, including who owns resources, who takes 
actions, who uses information, who coordinates with whom, the tasks about which
they coordinate, who communicates with whom, who is responsible for what, and
who shall provide backup to whom (Szilagyi & Wallace, 1990). Methods such as 
computational organization theory have established quantitative models that are 
analyzed for known issues via formal models and simulation. These models apply 
to fully trained organizations facing tasks within their operating procedures and 
knowledge base; other important factors such as team response to high workload 
and time pressure are not as well captured (see Schraagen & Rasker, 2003 for a 
more expansive discussion of the limitations of these approaches). Also relevant 
here is the social organization and cooperation analysis component of Cognitive 
Work Analysis, in which at least two aspects of team design are examined: form of 
cooperation, and content (criteria governing the allocation of responsibilities and 
roles among actors) (Vicente, 2003; see also Naikar et al., 2003). 

Conflicting results are found in the literature when examining specific features
such as, for example, particular forms of flat versus hierarchical teams or the 
attributes of good team communication (see review in Schraagen & Rasker, 2003). 
Thus, the work environment is also increasingly recognized as a determinant of 
team performance. For example, Schraagen and Rasker (2003, p. 780) noted, 
“Seemingly disparate results may be reconciled by taking into account whether 
team members have to deal with unanticipated disturbances in the environment or 
not.” To this end, the field of organization design has moved to an open-systems 
viewpoint that emphasizes the characteristics of the environment and that views 
a team as a complex set of dynamically intertwined and interconnected elements 
that are highly responsive to their environment. 


Relevant Aspects of Human Behavior within Teams 

Several specific aspects of human team behavior are also relevant to human- 
automation function allocation. The first is trust, defined in studies of human 
teams as “a willingness to rely on another party and to take action in circumstances 
where such action makes one vulnerable to the other party” (Doney, Cannon 
& Mullen, 1998), although some distinguish between belief in the other party 
(trust) and concomitant actions (reliance) (for example, Lee & See, 2004). Trust 
is also flexible, adjusting to team members’ apparent capabilities in changing 
circumstances. By these definitions, trust is central to the actual establishment 
of responsibility within a team (as compared to a normative designation); a team 
member may be willing (or unwilling) to rely on another agent to autonomously 
execute a function which contributes to outcomes for which they are responsible. 

The second relevant construct is the shared mental model, with similar 
constructs commonly called team knowledge, shared situation awareness, and 
complementary mental model (Cooke, Salas, Cannon-Bowers & Stout, 2000; 
Endsley, 1995; Sperling & Pritchett, 2006). Amongst many definitions, Sperling 
and Pritchett (2006) defined a complementary mental model as the condition in 
which: (1) each team member has the knowledge necessary to conduct his/her
tasks; (2) each team member knows which information is known by the other team
member should he/she need to seek it; and (3) each team member knows which
information other team members need from him/her, and when.

Teams with members that share similar knowledge structures regarding the 
task, the environment, equipment, member capabilities, and member interactions 
communicate more effectively and perform better than teams whose members do 
not share such knowledge, especially in high workload environments (for example, 
Cannon-Bowers, Salas & Converse, 1993; Klimoski & Mohammed, 1994; Rouse,
Cannon-Bowers & Salas, 1992; Stout, Cannon-Bowers, Salas & Milanovich, 
1999). However, the development of shared mental models is not guaranteed, and 
is driven in part by team structure and by information distribution within the team. 

The third relevant construct addresses communication. This is tightly coupled
with shared mental models, which replace constant communication with the use
of an evolving knowledge structure informed principally about changes (Cannon-
Bowers et al., 1993; Rouse et al., 1992). Communication is also the means by
which teams coordinate resources and activities (Entin & Serfaty, 1999). While 
good communication can be vital, bad communication patterns can instead be 
disruptive. In taxonomies such as that proposed by Entin and Entin (2001), a key 
measure of good communication is team members' anticipation in communicating 
information or transferring actions at a point where it is desired yet before the 
recipient overtly requests it. Therefore, the combination of a team structure and 
shared mental model that enables team members to predict each other’s information
needs and provide information at useful, non-interruptive times is important for 
communication patterns that will improve performance (for example, Hutchins, 
1992). 

The fourth relevant construct is adaptation in cognitive behavior due to context. 
Studies suggest that humans can prioritize and select strategies for their cognitive 
activities to match their circumstances, including their immediate demands (for 
example, perceived time available) and resources (for example, information 
availability, relevant skills). This activity has been termed cognitive control and 
control strategies are described as Cognitive Control Modes within Hollnagel’s 
Cognitive Control Model (Hollnagel, 1993). These adaptations in behavior extend 
to humans’ strategies for interaction within a team. 


Comparing Team Design to Human-Automation Function Allocation 

In contrast to the preceding discussion of human teams, current function allocations 
between humans and automation are often driven by the machine’s technical 
capabilities. For example, once lined up on the runway, current technology in air
transport aircraft requires pilots only to enable the autopilot, raise the flaps and
gear after takeoff and, some hours later, lower the flaps and gear before landing
and then taxi in after an autoland. However, these same systems can only operate
in nominal conditions (leaving the pilots to detect and respond to the off-nominal)
and they do not interact with the air traffic controller (leaving the pilots to re-format
ATC commands for data entry into the FMS). Thus, many aspects of the humans’
assigned functions (for example, monitoring and re-formatting of data) do not work 
to their strengths and create new mechanisms for error (for example, Bainbridge, 
1983; Sarter & Woods, 2000; Wiener & Curry, 1980). 

Technology-focused categorizations, such as level of automation (for example, 
Billings, 1997; Parasuraman et al., 2000; Sheridan, 1992), are limited for the 
purpose of allocating functions within human-automation teams. For example, a 
high level of automation may automatically execute actions, yet if it is consistently 
monitored and overridden by a human it contributes little; conversely, a low level 
of automation may only suggest a course of action yet, if a human does not have 
the capability to assess its validity, then the human may be cognitively railroaded 
into following its output exactly (Pritchett, 2005). Even when used as intended, 
descriptions of automation’s functioning via categorizations such as levels of 
automation do not delineate the humans’ involvement in and responsibility for the 
associated work (for example, Feigh & Pritchett, 2014). Further, like attempts in 
team design to identify the best team structure, categorizations such as levels of 
automation will find conflicting results due to the impact of organizational context 
and the immediate demands of the work environment. 

Related research has identified the need to additionally anticipate how 
automation changes humans’ conduct of the functions still remaining to them— 
and how actual human behavior may not mirror that anticipated during design (for 
example, Bainbridge, 1983; Parasuraman & Riley, 1997; Sheridan, 1992; Wiener &
Curry, 1980; Woods, 1985). Traditional approaches attempted to incorporate such 
concerns via measures of human performance; however, given the extent to which 
human performance is driven by the work context, static or de-contextualized 
measures of human performance are not sufficient. Pritchett (2005, p. 57), for 
example, described pilots’ use of flight deck alerting systems as an “opportunistic 
response to the full demands of the environment at the time of the alert,” noting 
that interactions between humans and automation are a response to their collective 
work environment. 

To respond to changing context and environment, some function allocations 
may be dynamic, that is, the automation may be adaptive (changing its function 
allocation automatically) or adaptable (allowing its human operator to change 
its function allocation) (Kaber, Wright, Prinzel & Clamann, 2005; Miller & 
Parasuraman, 2007; Scerbo, 1996). Beyond the just-noted function allocation 
strategies for static function allocations, dynamic function allocation for adaptive 
automation may be based on the actual or predicted performance of the human, 
psychophysiological assessment, or in response to events pre-identified as 
warranting a change. While some current automated systems may technically fit 
within these definitions (for example, a relief valve that clicks on when pressure 
drops too low, or an alerting system that detects some critical event), research 
into adaptive automation is examining more complex function allocations 
(Feigh, Dorneich & Hayes, 2012). Likewise, research into adaptable automation
is examining sometimes-intricate function allocations that can be described
coherently (for example, the playbooks described by Miller & Parasuraman, 2007) 
or that mirror changes in human cognitive control (for example, Pritchett, Kim & 
Feigh, 2014a, 2014b). 
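
The distinction between adaptive and adaptable automation can be made concrete with a short Python sketch. It is illustrative only and assumes a single scalar workload estimate as the trigger for adaptive reallocation, standing in for the performance or psychophysiological measures mentioned above. The class and function names, the function label "monitor_traffic", and the 0.8 threshold are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class Agent(Enum):
    HUMAN = "human"
    AUTOMATION = "automation"


@dataclass
class FunctionAllocation:
    """Maps each function name to the agent currently responsible for it."""
    assignments: Dict[str, Agent]

    def reassign(self, function: str, agent: Agent) -> None:
        self.assignments[function] = agent


def adaptive_step(allocation: FunctionAllocation, function: str,
                  workload: float, threshold: float = 0.8) -> FunctionAllocation:
    """Adaptive: the automation changes the allocation itself.

    Assumption: a workload estimate above `threshold` hands the function to
    the automation; otherwise it returns to the human.
    """
    if workload > threshold:
        allocation.reassign(function, Agent.AUTOMATION)
    else:
        allocation.reassign(function, Agent.HUMAN)
    return allocation


def adaptable_step(allocation: FunctionAllocation, function: str,
                   operator_choice: Agent) -> FunctionAllocation:
    """Adaptable: the human operator decides who performs the function."""
    allocation.reassign(function, operator_choice)
    return allocation


if __name__ == "__main__":
    alloc = FunctionAllocation({"monitor_traffic": Agent.HUMAN})
    adaptive_step(alloc, "monitor_traffic", workload=0.9)
    print(alloc.assignments)
    adaptable_step(alloc, "monitor_traffic", Agent.HUMAN)
    print(alloc.assignments)
```

In the adaptive case the trigger sits inside the automation, whereas in the adaptable case the same reassignment happens only at the operator's request. The sketch does not model the workload the human incurs in adjusting to each change.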

Such adaptation in human-automation function allocation does not correspond
well to adaptation in human teams, due in large part to automation's lack of
"interpersonal skills" (at least at this time). Further, automation cannot learn these
skills and adjust its behavior toward the collective needs of the team during the
team-building exercises common to training in many safety-critical domains such as
aviation (for example, Salas, Wilson, Burke, Wightman & Howse, 2006).


Comparing Team Design to Mapping Rules for Team Task Allocation 

Comparing human-automation function allocation with the six mapping rules for 
team task allocation proposed by Rasmussen, Pejtersen and Goodstein (1994), the 
first, actor competency, notes that the heterogeneous demands of the work should 
be allocated according to actor capability. The relative capabilities of humans and 
automation have long been used as a basis for function allocation, as formalized in 
the famous (some might say infamous) “Men Are Better At/Machines Are Better 
At” list (Fitts, 1951). While the authors of this list recognized several caveats in its
application, subsequent designers often found it all too seductive to use as the sole
or primary justification for function allocation; the resulting function allocations can
be piecemeal and incoherent. At a more philosophic level, sole reliance on such
lists frames the humans’ contribution to the team as fundamentally flawed and 
thus needing automated coverage of human limitations, without building on their 
strengths and without considering team dynamics. While many have pointed to 
problems resulting from such function allocations to decry such lists (for example, 
Dekker & Woods, 2002), we argue, based on the team and organization design 
literature, that a candid assessment of human and technological capability is not 
fundamentally flawed: rather, it addresses only one factor in the wide range of 
factors demanding consideration. 

More detailed examination of human and automation competency is also often
warranted. For the human, for example, the required competency may need to
support behavior that is strategic, rule- or procedure-based, or quick and intuitive,
depending on context (for example, Feigh & Pritchett, 2006). In contrast,
automation's competency is not defined by whether it can adapt to context: its
competency is defined by whether it might be placed in conditions in which it
cannot operate and thus will appear to fail.

The second rule, access to information and action means, has not typically been
part of human-automation function allocation methods per se, but instead is often 
addressed in subsequent studies of automation’s effects on the human’s situation 
awareness and decision making. Notably, several studies have highlighted how a 
passive function allocation to the human can result in reduced information seeking
and degraded situation awareness (for example, Endsley, 1996), to the point that
the human’s assuming the workload of monitoring activities is, in some studies,
considered a positive indicator of performance (for example, Parasuraman,
Mouloua & Molloy, 1994). Even when the human is actively involved, they can
be “cognitively railroaded” by a lack of information and resources into accepting
the automation’s behavior without true oversight, or the automation can
unduly shape the human’s judgment of the environment or representation of
decision options, an effect generally known as automation bias (Bass & Pritchett,
2008; Layton, Smith & McCoy, 1994; Mosier, 1996). 

The third rule, communication and coordination, has several aspects: 

• First, the inclusion of automation often demands new communication and 
coordination amongst human team members. For example, new cockpit 
automation generally necessitates new Crew Resource Management 
strategies, such as the need for explicit cross-check between captain and 
first officer following any entry into the autoflight system. 

• Second, exchange of information and direction of action are generally 
recognized as the purposes of automation interfaces. However, we earlier 
noted that communication allows human team members to reflect on 
thoughts and to construct shared mental models and situational awareness. 
In contrast, automation interfaces only inform a representation of the work 
environment and task, rather than create a shared construct through a 
dialectic; the human does little to construct or modify the machine’s mental 
model, and experiences little of the dialogue that would also contribute to 
their own. 

• Third, high-performing teams tend to communicate useful information in 
meaningful ways yet automation tends to communicate either raw data or 
commands rather than intermediate interpretations; likewise, automation 
often provides this information in formats or representations that conflict
with the human’s information needs and, often, are inconsistent with their other
information sources (for example, Sarter & Woods, 2000; Wiener &
Curry, 1980). 

• Fourth, the timing of human-automation communication is generally 
initiated by the human scanning the interface or the automation announcing 
information. The former requires a level of effort determined in part by 
the usability of the interface; the latter is only useful if the information is 
announced when the human needs it. 

The fourth mapping rule addresses workload sharing. This workload includes 
several aspects: taskwork, the teamwork induced by human-automation 
interaction, and team maintenance activities such as adjusting to new function 
allocations. Many studies have highlighted situations where clumsy automation 
induces additional workload at inappropriate times or to inappropriate levels (for 
example, Billings, 1997; Wiener & Curry, 1980). This induced workload can be
prohibitive; Kirlik (1993), for example, demonstrated in a laboratory study how
high workload associated with human-automation interaction can make it rational 
for the human to perform the work manually. 

The dynamic workload sharing associated with adaptive automation can 
be difficult to establish because human team members will incur workload 
in adapting to each new function allocation. For this reason, Hilburn, Molloy, 
Wong, and Parasuraman (1993) questioned whether allocation for the purpose 
of regulating upcoming workload can be confidently based on current workload, 
performance or events. Miller and Parasuraman (2007) described this difficulty 
as a tradeoff between the workload the automation could seek to assume and the 
unpredictability of the work environment created by frequent transitions; they 
propose allowing the human team members to set the preferred operating point in 
this tradeoff. Likewise, Rasmussen, Pejtersen, and Goodstein (1994) and Vicente 
(1999) advise allowing the operator to "finish the design" in the context of use, 
given the difficulty in predicting these effects during design. 

The fifth mapping rule for team task allocation addresses safety and reliability. 
The discussion so far has addressed safety concerns with whether a function
allocation can create a workable team dynamic. In addition, safety concerns
may impact whether a human team member chooses to rely on automation as 
mediated by their trust in the automation and their own self-confidence at the task. 
Reliability concerns also include the team’s collective response to variations in
their work environment, including robustness to the off-nominal or unexpected. 
Here, the brittleness of automation noted earlier can be of especial concern: the 
automation may need to be considered a weak link and the team designed around 
it. However, function allocations which rely on extensive, boring monitoring of 
generally-reliable systems, or that rely on the human second-guessing time-critical 
commands from alerting systems and decision aids, have well-documented issues 
(for example, Parasuraman et al., 1994; Parasuraman & Riley, 1997; Pritchett, 
2005). 

Finally, the sixth mapping rule addresses regulation compliance. Certain
functions may need to be performed by the human team members for regulatory 
reasons, such as when they hold ultimate responsibility for the functions’ outcome. 
In addition, regulations often mediate team dynamics by specifying standard 
operating procedures or work practices. 


Modeling the Work of Human-Automation Teams 

A common theme in team design is understanding the team’s collective taskwork and
the teamwork induced by a function allocation. Thus, designing the work can be 
the most important concern in effective human-automation interaction. Indeed, 
work design is a multi-disciplinary problem, as specifications of work, such as 
concepts of operation, must intrinsically integrate economic and safety metrics, the 
potential contributions of (or constraints on) technology and human performance, 
and the regulatory, policy, and procedural considerations in allowing access to 
(and defining interaction within) the collaborative functioning of the system. 

Thus, the foundation of human-centered automation is laid in the work design. 
At such an early stage, human-in-the-loop (HITL) evaluation is not possible: the 
training, procedures, and technology are only specified in terms of the functions 
required. Instead, the important construct is: If everything, and everyone, in the 
system performs their functions perfectly, what will emerge? The answer is created 
by the interplay of the work environment (as defined by physics and regulations) and 
the team acting upon this environment. Concepts of operation can be constructed 
poorly when they are sensitive to small variations in how the work is performed, 
or where they assume actions will be performed with impossible speed or detail. 
For example, optimized profile descent concepts of operation must establish work 
activities to regulate the physics of a fuel-efficient descent, while recognizing that 
key variables—the aircraft performance, the wind profile, constraints imposed by 
the traffic stream—are known (or only partly known) at different locations and 
times, yet key decisions to descend earlier or later can have profound impacts on 
the aircraft’s ability to track its descent profile within the traffic flow. 

To ensure the overall specification of work is sound, work activities can be first 
modeled in detail without requiring detailed models of the agents. Conceptually, 
this analysis is best conducted with simple models of human performance to 
identify problems emanating from the feasibility of a concept of operation. Further, 
a concept of operation can be examined for its robustness and resilience: What if 
something doesn't go perfectly? Here, the system’s response to unexpected events 
can be modeled and simulated. These unexpected events may stem from several 
sources: exogenous inputs to the system (for example, on an air traffic system an 
unexpected tailwind); technology (for example, the failure of a radar system); or 
from human performance (for example, a limit on the number of simultaneous 
activities). The work involved in responding to these events will be emergent and 
dynamic, and a concept of operation can be designed to be more (or less) robust 
and resilient. 

Work can be analyzed in several ways. Approaches such as Contextual Design 
(Beyer & Holtzblatt, 1988) and Cognitive Work Analysis (Roth & Bisantz, 2013) 
provide qualitative and visual presentations of the work that are intended to guide 
and inform designers. Our own recent efforts have established a computational 
framework that enables work to be computationally modeled and simulated early 
in design (Pritchett, 2013), first to support analysis of the concept of operation 
(that is, the taskwork) and then to examine the design of the team itself (that is, 
allocation of functions within the team, and their teamwork). 


Requirements for Effective Function Allocation 

Function allocation distributes work between agents, human and automated, 
within a team. The following requirements are summarized from more extensive 
discussions of the requirements for effective function allocation, and methods to 
model and measure function allocation (Feigh & Pritchett, 2014; Pritchett, Kim & 
Feigh, 2014a, 2014b). 

Requirement 1: Each agent must be allocated functions that it is capable 
of performing 

Every agent in the team must be capable of each of the functions assigned to 
him/her/it, viewing each function in isolation. In a very coarse sense, such a strategy 
is supported by assessments of what “Men Are Better At” to what “Machines Are 
Better At.” From this perspective, automation can serve to provide functions that a 
human cannot perform at all or with sufficient reliability. However, the automation 
must not be brittle. Thus, a prediction of whether the automation will be placed 
outside its boundary conditions is itself a valuable metric that implies potential 
concerns with the resilient performance of the team. 

A further consideration in creating effective human-automation interaction 
examines responsibility and authority. Except when automation is proven to 
provide safety in all foreseeable operating conditions, humans remain vested with 
the responsibility for the outcome of automation’s actions, a situation termed 
the “responsibility-authority double-bind” (Woods, 1985). If the human cannot 
knowledgably oversee the automation, they are forced to rely on it. However, 
without a concrete basis for assessing if the automation is correct, humans often 
over- and under-trust the automation (Parasuraman & Riley, 1997); either way, 
incorrect trust is viewed as human error, despite its basis in the function allocation. 
Thus, identification of mismatches between responsibility and authority is itself a 
valuable metric that implies potential concerns with trust and reliance. 
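
As a minimal sketch of how the two checks in Requirement 1 might be automated early in design, the following Python fragment flags functions allocated to an agent that cannot perform them in isolation, and functions whose executing agent differs from the agent held responsible for the outcome. The `Function` record, the agent names, the function names, and the capability sets are all hypothetical illustrations and are not taken from the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Function:
    name: str
    performed_by: str  # agent with authority, i.e. the one executing the function
    responsible: str   # agent held responsible for the outcome


def allocation_issues(functions, capabilities):
    """Return (incapable, mismatched) lists of function names.

    capabilities maps an agent name to the set of function names that the
    agent can perform when each function is viewed in isolation.
    """
    incapable = [f.name for f in functions
                 if f.name not in capabilities.get(f.performed_by, set())]
    mismatched = [f.name for f in functions
                  if f.performed_by != f.responsible]
    return incapable, mismatched


# Hypothetical allocation, for illustration only.
functions = [
    Function("track_descent_profile", "autoflight", "pilot"),
    Function("communicate_with_atc", "pilot", "pilot"),
    Function("monitor_traffic", "autoflight", "pilot"),
]
capabilities = {
    "pilot": {"track_descent_profile", "communicate_with_atc", "monitor_traffic"},
    "autoflight": {"track_descent_profile"},
}

incapable, mismatched = allocation_issues(functions, capabilities)
print(incapable)   # ['monitor_traffic']
print(mismatched)  # ['track_descent_profile', 'monitor_traffic']
```

A real analysis would also need to represent the automation's boundary conditions, which this sketch reduces to simple set membership.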

Requirement 2: Each agent must be capable of performing its collective set 
of functions 

The metric for success here is whether each agent can perform his/her/its 
collective set of functions under realistic operating conditions. Thus, prediction 
of the taskload placed on the human operators (or, where possible, workload) is 
a valuable metric. To fully address known issues with taskload corresponding to 
function allocation, such assessments must consider the full range of activities 
required, including underlying cognitive activities around information gathering 
and judgment, and requirements to monitor automation, in addition to explicit 
manual activities. Further, metrics should consider not only aggregate or average 
workload, but also workload spikes and periods of complacency. 
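
To illustrate why aggregate measures alone can mislead, the sketch below builds a second-by-second taskload profile from a list of activity intervals and reports the mean, the peak, the time spent at or above a spike threshold, and the time with no activity at all. Counting concurrent activities as "taskload", treating empty seconds as a rough proxy for low-demand periods, and the interval values themselves are all assumptions made for illustration.

```python
def taskload_profile(activities, horizon):
    """Count concurrent activities in each one-second step of [0, horizon).

    activities is a list of (start, end) pairs in whole seconds; it should
    include monitoring and cognitive activities, not only manual ones.
    """
    load = [0] * horizon
    for start, end in activities:
        for t in range(max(start, 0), min(end, horizon)):
            load[t] += 1
    return load


def summarize(load, spike_threshold):
    """Return (mean, peak, seconds at or above threshold, seconds with no load)."""
    mean = sum(load) / len(load)
    peak = max(load)
    spike_seconds = sum(1 for x in load if x >= spike_threshold)
    idle_seconds = sum(1 for x in load if x == 0)
    return mean, peak, spike_seconds, idle_seconds


# Hypothetical activity intervals for one agent.
activities = [(0, 4), (2, 6), (3, 5), (8, 10)]
load = taskload_profile(activities, horizon=10)
print(load)                # [1, 1, 2, 3, 2, 1, 0, 0, 1, 1]
print(summarize(load, 3))  # (1.2, 3, 1, 2)
```

Here the mean of 1.2 concurrent activities hides both a three-activity spike and a two-second gap with nothing to do.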

Further, human-centered automation requires coherent roles. One attribute of 
a coherent function allocation can be viewed from the bottom up: each agent's 
functions share (and build upon) obvious, common constructs, such as a shared 
information and knowledge basis, and the allocation prevents conflicts between 
the actions of different agents. Another attribute can be viewed from the top 
down: the functions collectively contribute toward work goals in a manner that 
is not only apparent to the human, but that can be purposefully coordinated and 
adapted in response to context. Thus, the coherence of the function allocation is 
intrinsically an important construct warranting its own analysis. 

Requirement 3: The function allocation must be realizable with 
reasonable teamwork 

Each different function allocation demands its own unique set of teamwork 
functions, including human-automation interaction and human-human 
coordination. The impact of this teamwork must then be considered from the 
perspective of the previous two concepts: can each agent perform each of 
his/her/its teamwork activities in isolation, and can each agent perform its assigned set 
of both taskwork and teamwork functions? 

Members of good teams are able to anticipate each other’s information needs 
and provide information at useful, non-interruptive times. However, too often 
automation is clumsy: it unduly interrupts its human team members because it 
cannot sense whether other team members would benefit from an interruption. 
Thus, the potential for a function allocation to cause agents to interrupt each other 
is an important construct to be analyzed. In some cases, such as poorly-timed 
output from automation, such interruptions may be unwarranted; in other cases, 
different function allocations may require agents to interrupt each other because of 
how their functions are interleaved. 
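
One simple way to estimate the interruption potential of a function allocation from a simulated timeline is to count the messages that reach an agent while that agent is busy with another activity. The sketch below assumes such a timeline is available; the message times, the busy intervals, and the agent names are hypothetical.

```python
def count_interruptions(messages, busy_intervals):
    """Count, per recipient, the messages that arrive while the recipient is busy.

    messages: list of (time, recipient) pairs.
    busy_intervals: dict mapping an agent to a list of (start, end) intervals,
    where an agent is considered busy for start <= time < end.
    """
    counts = {}
    for time, recipient in messages:
        intervals = busy_intervals.get(recipient, [])
        if any(start <= time < end for start, end in intervals):
            counts[recipient] = counts.get(recipient, 0) + 1
    return counts


# Hypothetical simulated timeline.
messages = [(3, "pilot"), (7, "pilot"), (12, "pilot"), (5, "autoflight")]
busy_intervals = {"pilot": [(2, 5), (10, 15)]}

print(count_interruptions(messages, busy_intervals))  # {'pilot': 2}
```

This count does not distinguish unwarranted interruptions from those forced by interleaved functions; that distinction would need information about why each message was sent.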

Requirement 4: The function allocation must support the dynamics of the work 

Analysis of a function allocation should identify situations where, for example, 
the interleaving of functions assigned to disparate agents requires significant 
coordination or idling as one waits on another, or where workload may accumulate, 
or where one agent will be unduly interrupting another, or where executing 
prescribed procedures may conflict with other work demands, or where automation 
may be placed outside its boundary conditions. These issues were discussed in the 
preceding sections, but are repeated here to note their dynamic nature. 

Further, resilience is fostered when a human agent may select strategies 
(courses of action) appropriate to the state of the environment and their own 
capabilities. The ability of human team members to adapt to context reflects 
the ability to balance the demands on them and the resources available to them 
in terms of information, knowledge, and time available (Feigh & Pritchett, 
2006). However, such adaptation can be constrained by an overly prescribed (or 
proscribed) function allocation, particularly where human-automation interaction 
dictates a sequence of activities from the human. Such overly prescribed function 
allocations manifest in work-arounds or dis-use of automation (Feigh & Pritchett, 
2010; Parasuraman & Riley, 1997). Thus, the degree to which a function allocation 
can accommodate a reasonable variety of human adaptations to context should 
also be analyzed and fostered. 

Likewise, human-centered automation should foster the humans’ ability to 
maintain a stable work environment. A function allocation may aggravate inherent 
environmental unpredictability by, for example, limiting human agents’ ability 
to view important aspects of the environment or by distributing functions in a 
way such that one agent will trigger the requirement for another to act. Indeed, a 
tradeoff exists between maintaining predictability versus dynamically allocating 
functions (Miller & Parasuraman, 2007). Thus, humans’ ability to predict their 
activities has intrinsic value and should be fostered. 

Requirement 5: The function allocation should be the result of deliberate 
design decisions 

Changes in operational concepts may be incremental and constrained by current- 
day technologies, procedures, personnel and/or policies; in other cases, changes 
in concepts of operation may represent significant innovations in common work 
practices and relationships between tasks and tools. Either way, designers need to 
simultaneously consider the economic and safety metrics by which the total system 
will be evaluated, the potential contributions of (or constraints on) technology 
and human performance, and regulatory, policy and procedural considerations. 
Thus, the design of human-centered automation should consider not only each 
agent’s experience, but also simultaneously consider the cost and performance of 
the combined efforts of the human-automated team. 


Conclusion: Perfect Human-centered Automation is Impossible 

In an earlier study we examined four function allocations using computational 
simulations of work, ranging from full autoflight with datalink (FA1) through 
progressively less automated conditions to pilot control of the trajectory by setting 
autopilot targets (FA4) (see Pritchett, Kim & Feigh, 2014a, 2014b). In these 
simulations we also assumed that the human agent (the pilot in this case) might 
exhibit three different behaviors, as represented by the Opportunistic, Tactical 
and Strategic cognitive control modes (CCM). Figure 10.1 reflects a subset of the 
metrics for the “most-automated” FA1 and the “least-automated” FA4, normalized 
such that 100 percent represents the ideal: perfect human-centered automation 
would have 100 percent on each of these metrics. Instead, each function allocation 
scores higher on some metrics and lower on others. The more automated function 
allocations required better (less) interaction with the pilot but were less predictable 
to the pilot, made for a lower coherency role for the pilot and interrupted the pilot 
more. The less automated function allocations provided a more coherent role for 
the pilot and more predictability, at the expense of requiring them to do more of 
the work. Further, all of the function allocations assumed the pilot would perform 
monitoring activities that we predict the pilot would shed in the opportunistic and 
tactical CCM. 

In the end, all of the function allocations met the mission goals in this case. 
This reflects a situation common in aviation—the agents can adapt and respond to 
the environment to get things done. The challenge in designing human-centered 
automation is identifying how to design the work—the concept of operation and the 
function allocation within it—that strikes the right balance between key tradeoffs 
inherent to divvying up the work to reduce workload, yet maintain coherency and 
predictability, and reduce interruptions. 


Enhancing Military Helicopter Pilot 
Assistant Systems Through Resource 
Adaptive Dialogue Management 

Felix Maiwald & Axel Schulte 

Universität der Bundeswehr München, Germany 


Introduction 

The Institute of Flight Systems (IFS) oversees a 20-year research agenda on 
knowledge-based assistant systems in vehicle guidance work processes that 
provide an alternative way of human-automation co-action. Such assistant 
systems have proven their potential to overcome human factors-related problems 
associated with more traditional purely technology driven automation philosophies 
(Prevot, Gerlach, Ruckdeschel, Wittig & Onken, 1995; Schulte & Stütz, 1998). 
This approach is denoted by the term of Dual-Mode Cognitive Automation 
(DMCA), which will be explained in more detail below. Automation derived 
from the proposed design principles represents a cognitive system engineering 
approach to handle workload-induced problems by providing situation adapted 
hints, warnings, and decision-aids while actively keeping the human operator in 
the loop. 

Traditional automation systems in aeronautics almost exclusively adopt 
sensory-motor tasks in human factors engineering; that is, the gathering and 
processing of continuous signals as well as continuous manipulation (for example, 
sensory systems for gathering the flight status, autopilot, and FMS). They are 
considered necessary to keep the human operator freed from a number of continuous 
tasks which otherwise could lead to overload. The human tasks remaining are 
well described in general by the notion of supervisory control (Sheridan, 1992), 
involving the human perception and comprehension of the situation, the planning 
of the course of actions, the decision making to select the appropriate means, and 
any kind of problem solving related to the mission. In general terms, these are 
higher cognitive tasks, which are to be processed sequentially by humans and 
inflict a heavy demand on mental processing resources (cf., for example, Wickens 
& Hollands, 2000). 

The concept of having automated functions in the work process as low- 
authority technical specialists supporting a particular operation but lacking the full 
perspective, was very convenient as long as only simple automated functions were 
used. It was good that the responsibility to meet the prime goals of the work process 
was exclusively on the human operator’s shoulders. This might become critical, 
however, since with increasing conventional automation complexity, the task of 
the operator as supervisor becomes more and more complex, as well. Instead of 
the intended unloading, a danger of over-burdening the supervisory operator may 
inevitably arise in certain situations, resulting in a loss of situation awareness and 
control performance (that is, failing to meet the prime goals). The operator might 
be unaware of discrepancies between activities of automated functions and the 
prime goal necessities. 

Especially in so-called “autonomous” systems, designers more and more tend 
to transfer the authority from the human operator to the increasingly complex 
automated functions. Although relieved of workload in the first place, 
the human operator will be burdened with extra demands from increasing system 
monitoring and management tasks. At a certain point the human operator fails in 
the supervision of such complex automated systems. 

These deficiencies of conventional automation are well known to the scientific 
community (Billings, 1997; Sarter, Woods & Billings, 1997). To counteract 
these drawbacks, Billings (1997) formulated his principles of human-centered 
automation. These recommendations, as well as other considerations, led us to 
introduce a new kind of automation into the work process (Onken & Schulte, 
2010). Complex automation shall no longer be under error-prone and demanding 
supervisory control of the human, but rather it will work in a close partnership with 
the human operator in the form of an artificial cognitive agent. To achieve this we 
address the research question of how to design such cooperative assistant systems. 
Furthermore, these assistant systems shall be adaptive to the state of the mental 
resources of the human operator. As an application example we chose a military 
helicopter pilot assistant system in a high workload mission context involving 
Unmanned Aerial Vehicles (UAVs) controlled by the helicopter’s cockpit crew 
(that is, manned-unmanned teaming (MUM-T) missions; Strenzke, Uhrmann, 
Benzler, Maiwald, Rauschert & Schulte, 2011). 

The following section will provide a general overview of the overarching 
approach to the design of cognitive and cooperative automation, before we provide 
details on the resource adaptive dialogue management. 


Dual-Mode Cognitive Automation (DMCA) 

Here we will introduce the approach of DMCA (Onken & Schulte, 2010) to 
the human-centered automation of vehicle guidance and control work systems. 
First, we will briefly look at the aspect of cognitive automation before focusing 
on the dual-mode-concept. In essence, DMCA means that an artificial cognitive 
agent can either be delegated tasks or it can work in cooperation with the human 
user. Most important to the following explanations is the fact that we will solely 
consider automation operated by the human user in the sense of monitoring tasks 
in supervisory control. Automation working as an inherent part or function of the 
system, but not explicitly noticed by the human operator (for example, stability 
augmentation systems, governors) will not be considered. 

Cognitive automation concepts 

Cognitive automation denotes an automation design, which, as opposed to 
conventional automation, incorporates architectures, information-processing 
methods, and algorithms implementing higher cognitive capabilities, such as 
planning, problem solving, or decision making as part of the machine functions. 
That automation is able to adopt higher cognitive tasks from the human, whenever 
necessary or appropriate. In avionics systems development we can only find 
somewhat more conservative approaches so far, which is primarily due to 
certification issues (Jarasch & Schulte, 2008). 

In order to systematically derive the aforementioned notions of higher cognitive 
capabilities, a three-layered model of human cognitive sub-functions, based upon 
Jens Rasmussen’s (1983) classical SRK-taxonomy (skills, rules, knowledge) was 
developed (Onken & Schulte, 2010). For cognitive automation the consideration 
of the highest behavior level, the concept-based layer (knowledge-based behavior 
according to Rasmussen, 1983), is of particular interest. Only capabilities located 
here enable humans to make rational decisions when encountering unforeseen or 
unknown situations based upon the interpretation of the situation, consideration 
of abstract goals for action, and the planning of the course of action. This enables 
the human to act in a flexible way, unlike conventional and rigid automation. 
Conventionally automated systems always act within the solution space, created 
during design. In unforeseen situations automation may inadequately interact with 
the user (brittleness). These systems will exhibit stereotypical behavior patterns 
upon user input, which can be inadequate for the current situation (literalism). 
And, as described by Billings (1997), machine behaviors may not be visible or 
understood by the user (opacity). 

Cognitive automation, in principle, is a promising approach to mitigate these 
problems (Schulte & Meitinger, 2010), although this is by no means guaranteed. 
Potential methods to implement cognitive automation are Artificial Intelligence 
(AI) algorithms (Russell & Norvig, 2003), soft computing or machine learning 
approaches (Mitchell, 1997), intelligent (multi-)agent systems (Wooldridge, 
2009), and/or based upon cognitive architectures, such as Soar (Laird, Newell 
& Rosenbloom, 1987). The IFS already develops second-generation cognitive 
system architectures (Putzer & Onken, 2003; Brüggenwirth & Schulte, 2012), in 
order to provide accessible AI algorithms to application developers. Today, these 
cognitive system architectures are the underlying frameworks to most of the ACUs 
(Artificial Cognitive Units), developed at the IFS (Rauschert & Schulte, 2012; 
Uhrmann & Schulte, 2012). 

“Dual-mode” automation concepts 

According to the dual-mode concept of cognitive automation (Onken & Schulte, 
2010), said ACUs shall be integrated as additional components into conventionally 
automated systems in two distinct styles, either in a delegation-style relationship 
with the human operator (mode 1, right robot head in Figure 11.1) or in co-agency 
with the human operator (mode 2, left robot head in Figure 11.1). 



Figure 11.1 Introduction of Artificial Cognitive Units (ACUs) into vehicle 
guidance and control system (SC = supervisory control, CC = 
cooperative control) 


Detailed specifications within a delegation-style relationship between the 
human and the ACUs can be spread over a wide range. One end of the spectrum 
is given by the so-called “(fully) autonomous systems.” With respect to the design 
of human-machine systems the notion of technical autonomy is insufficient, 
ill-defined, and potentially misleading (Onken & Schulte, 2010), because the 
necessary user interaction in the process of delegation and supervision is often not 
considered adequately. At the other end of the spectrum we can identify system 
approaches that explicitly address the delegation relationship to the human user 
(Miller, Funk, Dorneich & Whitlow, 2002; Miller & Parasuraman, 2007). The 
second mode of cognitive automation is represented by the ACUs working in 
co-agency with the human user. We call this operational concept cooperative control. 
Onken & Schulte (2010) connect this mode with the notion of (knowledge-based) 
assistant systems, a term which was coined by the authors in the late 1980s. A 
series of prototypes were developed and successfully field-tested at the University 
of the Bundeswehr Munich (UBM). The most important ones were CASSY 
(Cockpit Assistant System, Prevot et al., 1995) and CAMA (Crew Assistant 
Military Aircraft, Walsdorf & Onken, 1999). Various international efforts such as 
the Pilot’s Associate (Banks & Lizza, 1991), the Rotorcraft Pilot Associate (Miller, 
Guerlain & Hannen, 1999) and the CogPit (Taylor et al., 2001) also made valuable 
contributions to the formation of a theory (for example, Onken & Schulte, 2010). 

The starting point of the specification of such an assistant system is the work 
system (Figure 11.2), which is the physical instance of a work process defined by a 
work objective. Within the work system the assistant system on a conceptual level 
belongs to the operating force, as the human user does, and is no longer part of the 
operation-supporting means. 


[Figure 11.2 diagram; recoverable label: Environmental conditions & supply] 

Figure 11.2 Work system with assistant system 

The analysis of the human operator’s characteristics being part of the operating 
force leads us to the specification of the assistant system. These can be directly 
rephrased as requirements for the assistant system. The human operator (and 
therefore the assistant system) shall: 

• know and understand the work objective and pursue it by its own initiative; 

• understand the situation (including the cognitive state of the operator, with 
respect to intentions, actions, attention allocation, workload, resources, and 
so on); 

• select the tasks and means in order to achieve the work objective; and 

• know about and deploy operation-supporting means efficiently. 

Furthermore it is required that the individual units within the operating force 
cooperate (that is, human user and assistant system). From this central requirement 
the term cooperative control was coined. The behavioral rules within this cooperation 
are described by the basic requirements of human-machine cooperation (Onken, 
1994; Onken, 2002; Onken & Schulte, 2010). According to these: 

• First of all, the human operator shall pursue his or her work by use of 
the given operation-supporting means for all normal situations foreseeable 
during design time. 

• Concurrently, the assistant system shall generate an understanding of the 
situation, and guide the operator’s attention to the most urgent task, if 
necessary (first basic requirement). 

• If the operator is overtaxed by task performance, the assistant system shall 
take measures to transfer the task situation into one which can be handled 
by the user in a normal manner (second basic requirement). 

• Finally, if there are tasks, which cannot be executed by the human operator 
and are too important to neglect, the assistant system shall take over these 
tasks or assign them to appropriate operation-supporting means (third 
basic requirement). 

The authority to accept, manipulate, or even define the work objective, however, 
shall always remain with the human being. According to Onken and Schulte 
(2010), this implies the only consistent definition of autonomy in the realm of 
technical systems, which for ethical and other reasons shall never be granted to a 
machine. 
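
The three basic requirements listed above can be read as an escalating intervention policy. The following sketch is one possible encoding, not the authors' implementation: the boolean inputs, the `Intervention` names, and in particular the priority order in which the conditions are tested are assumptions made for illustration. Note that the policy never touches the work objective, whose authority remains with the human.

```python
from enum import Enum


class Intervention(Enum):
    NONE = "no intervention"
    GUIDE_ATTENTION = "guide attention to the most urgent task"
    REDUCE_TASK_LOAD = "transfer the task situation into a manageable one"
    TAKE_OVER = "take over the task or assign it to operation-supporting means"


def select_intervention(urgent_task_neglected, operator_overtaxed,
                        task_executable_by_operator,
                        task_too_important_to_neglect):
    """Pick an intervention from hypothetical boolean situation assessments."""
    # Third basic requirement: the task cannot be executed by the operator
    # and is too important to neglect.
    if not task_executable_by_operator and task_too_important_to_neglect:
        return Intervention.TAKE_OVER
    # Second basic requirement: the operator is overtaxed.
    if operator_overtaxed:
        return Intervention.REDUCE_TASK_LOAD
    # First basic requirement: attention is not on the most urgent task.
    if urgent_task_neglected:
        return Intervention.GUIDE_ATTENTION
    # Normal situation: the operator works with the operation-supporting means.
    return Intervention.NONE


print(select_intervention(True, False, True, False).name)  # GUIDE_ATTENTION
print(select_intervention(False, True, False, True).name)  # TAKE_OVER
```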

Adaptive automation concepts 

As we can derive from the aforementioned basic requirements, the assistant 
system shall obtain knowledge and understanding of particular cognitive states 
of the human user (for example, tasks currently worked on, attention allocation, 
actions, workload, excessive demands, mental resources), in order to control its 
own intervention policy. Such concepts can be found under the term of adaptive 
automation (Kaber, Riley, Tan & Endsley, 2001; Inagaki, 2003; Scerbo, 2007). 
Adaptive automation denotes a human-automation interaction concept, in which 
the automatic functions are adjusted (for example, with respect to their level of 
automation, Sheridan, 1992) to the available human mental resources or subjective 
workload. One critical aspect of adaptive automation is the online measurement or 
determination of the operator’s mental state. 
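
As a rough illustration of the adaptive automation idea, the sketch below raises or lowers a discrete level of automation depending on an online workload estimate. The numeric level scale, the normalized workload value in the range 0 to 1, and the two thresholds are hypothetical; the gap between the thresholds is one simple way to avoid switching back and forth when the estimate hovers around a single cut-off.

```python
def adapt_level(level, workload, low=0.3, high=0.7, min_level=1, max_level=4):
    """Return the next level of automation for a given workload estimate.

    level: current level of automation (higher means more automated).
    workload: assumed normalized workload estimate in [0, 1].
    """
    if workload > high:
        return min(level + 1, max_level)  # hand more work to the automation
    if workload < low:
        return max(level - 1, min_level)  # hand work back to the operator
    return level                          # inside the dead band: no change


level = 2
for estimate in [0.5, 0.8, 0.9, 0.6, 0.2]:
    level = adapt_level(level, estimate)
print(level)  # 3
```

The quality of any such scheme depends on the workload estimate itself, which is the measurement problem noted above.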

Schulte and Donath (2011) studied the use of assistant systems to alleviate 
excessive mental demands in multi-UAV guidance. In this study pilots had to 
manage up to three UAVs in a MUM-T scenario using conventional automation 
(that is, guidance with a FMS by specifying waypoints). Laboratory experiments 
were conducted in a flight simulator. Increasing the task load, and thereby the 
subjective workload, caused the test subjects to exhibit self-adaptive strategies 
(SAS) (Sperandio, 1978). Figure 11.3 shows the workload region in which SAS are 
likely to occur. With the expenditure of extra mental effort acceptable performance 
can be further maintained in this region. 

In Schulte and Donath (2011), respective SAS were observed by use of manual 
interaction protocols and gaze measurement. Donath (2012) provides Hidden 
Markov Models (HMMs) of the observed manual interaction sequences, in which 
the hidden states represent the sub-tasks of the indicator task chosen. The method 
developed paved the way for machine recognition of human task performance. 

Figure 11.3 Human performance depending on subjective workload can be 
extended by effort using self-adaptive strategies 

If the assistant system detects task performance deviations or even known SAS, 
high workload states of the operator can be detected from behavior observation. 
For the purpose of excess workload detection, the individual HMMs are 
associated with specific workload states of the user. 
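
One common way to use several HMMs for recognition is to score an observed sequence under each model and select the model with the highest likelihood. The sketch below shows this with the scaled forward algorithm. It is an assumed reading of how workload-specific HMMs could be applied, not the method of Donath (2012): the two models, their probabilities, and the coding of interactions as symbols 0 and 1 are invented for illustration.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | HMM).

    start[i], trans[i][j] and emit[i][o] are probabilities; obs is a list
    of observation symbol indices.
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    scale = sum(alpha)
    log_p = math.log(scale)
    alpha = [a / scale for a in alpha]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
                 for j in range(n)]
        scale = sum(alpha)
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_p


def classify(obs, models):
    """Return the name of the model under which obs is most likely."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical models: hidden states stand for two sub-tasks, symbols 0 and 1
# for two kinds of manual interaction. "normal" tends to stay on a sub-task,
# "high_workload" tends to switch rapidly between sub-tasks.
emit = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "normal": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], emit),
    "high_workload": ([0.5, 0.5], [[0.2, 0.8], [0.8, 0.2]], emit),
}

print(classify([0, 0, 0, 1, 1, 1], models))  # normal
print(classify([0, 1, 0, 1, 0, 1], models))  # high_workload
```

In practice the model parameters would be estimated from recorded interaction protocols rather than set by hand.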

In the following section we address the design of such a cooperative assistant 
system, the Military Rotorcraft Associate (MiRA), for the helicopter pilot in a 
MUM-T-mission. Here we make use of a cognitive task model, human operator 
observation, and human mental resource prediction approaches. 


Resource Adaptive Assistant Systems 

MiRA has been designed along the lines of the DMCA approach (Onken & Schulte, 
2010). In this section we discuss the general architecture of the assistant system, 
the effects of interventions by the system on human operator workload, introduce 
our concept to enhance MiRA by resource adaptive dialogue management, and 
provide information on the integration in our research flight simulator. 

General architecture of MiRA 

The specification of the MiRA functions that the human pilot interacts with follows 
a layered approach. On the most abstract layer the decision was made to design 
MiRA as an assistant system working in cooperative control (that is, mode 2 of 
DMCA). This decision implies that MiRA will become a part of the operating 
force in a work system to achieve a military transport helicopter mission work 
objective. 

In order to create the required understanding of the work objective a mission 
planner (Strenzke et al., 2011) and a task model are integral parts of the design of 
MiRA. While the mission planner generates a complete task agenda over the whole 
mission, the task model breaks down this agenda into individual tasks (Ruckdeschel 
& Onken, 1994). Both steps require the availability of mission-specific knowledge 
being implemented in the system. While the mission plan is kept updated as a 
reaction to global mission changes, the task model uses interaction monitoring and 
gaze tracking to synchronize with the actually performed tasks. Overarching goals 
as part of a goal model are used to analyze the course of action with respect to goal 
violations (for example, the violation of situation-specific clearances or system 
settings). In case of such a goal violation MiRA considers intervention. 

The timing of these interventions is designed according to the 
aforementioned basic requirements; that is, MiRA shall guide the operator’s 
attention to the most urgent task if necessary. If the workload is too high, MiRA 
shall manipulate the task load, to keep the subjective workload on an appropriate 
level. In those two cases MiRA starts appropriate dialogues by its own initiative. 
However, when initiating dialogues, the assistant system has to consider the current 
mental state and the available resources of the human operator. In the following 
section, this aspect will be explained in some more depth. 

Assistant system interventions and workload 

Following Schulte (2012a) the human operator shall perform tasks as long 
as he or she is able to do so under normal workload conditions. As long as the 
human succeeds without errors or excessive demands, the assistant system would 
not intervene at all (region I in Figure 11.4). The situation may change due to 
unplanned external events, which would trigger a workload increase (region II). 
According to the first and second basic requirements, the assistant system would be 
programmed to intervene in order to reduce the workload back to a manageable 
level (region II and finally region IV). 

However, if the human operator has to invest additional cognitive resources to 
recognize the offered support or interact with the system in any way, then due to 
the increased demand of resources, the workload level may increase at first rather 
than declining (region III in Figure 11.4). In extreme cases, the human will not 
be able to provide sufficient free cognitive resources to benefit from the offered 
support optimally. Wiener (1989) describes such adverse automation-induced 
effects as “clumsy automation.” 

Figure 11.4 Influence of assistant system’s interventions on operator workload 

To counteract this problem, we developed an approach which aims at 
minimizing the additional demands on cognitive resources (region III in Figure 
11.4), which must be provided by the pilot to perceive and handle the assistant 
system’s support. To this end, we built a human resource model along the lines of 
North and Riley (1988) and Wickens (2002), which enables MiRA to estimate the 
human operator’s workload and the remaining mental capacity for the current task 
situation. The general idea of the resource model is to connect each task with a 
“demand vector” quantifying the mental resource usage for information gathering 
(visual/auditory, spatial/verbal), for information processing (cognitive spatial, 
cognitive verbal), and for response generation (manual, vocal). The total resource 
demands for the moment shall then be computed by superposing the demand 
vectors of the current tasks by use of Wickens’ (2002) “conflict matrix.” This also 
allows the computation of a predictive workload value. In case the assistant system 
is initiating a dialogue, there exist the options to select a code (spatial/analog or 
verbal) and a modality (visual, auditory) for display, each of which is associated 
with a unique demand vector. By overlaying the related demand vectors with 
the current tasks’ vectors, the assistant system chooses the display option that 
causes the least extra workload and, thereby, minimal conflicts. The best available 
interaction modality is chosen in order to minimize the additional (automation- 
induced) demand on the pilot’s mental resources. 

Concept in detail 

The core elements of our implemented concept (Maiwald & Schulte, 2011) for adapting 
the information transfer (particularly the dialogues) to the pilot’s current 
spare cognitive resources are the task model, which determines the current tasks of 
the pilot; the resource model, which estimates the mental resource consumption, 
resource conflicts, and the mental workload for current tasks; and the 
determination of the best interaction resource. 

Task model 

In the first step, we captured all external influences on the pilot during a military 
transport helicopter mission (that is, the state of the helicopter, the mission 
objective, the current flight and mission phase, as well as environmental 
conditions). The interpretation of the helicopter task agenda leads to a pilot- 
specific task agenda. After aggregating this data into a full situational picture, this 
can be used to determine the current tasks the pilot should be executing. For this 
purpose, we deployed models of mission-typical task situations. These transition 
state networks have been developed based on knowledge acquisition experiments 
with German Army aviators. 

To make these normative task models dynamic, the assumed tasks are 
synchronized with the actual tasks the pilot is executing. Therefore, human- 
machine interactions such as visual information acquisition (that is, gaze tracking) 
as well as manual and verbal interactions, are analyzed. In this context, assumptions 
are made that observing the gazes and the manual and verbal interactions enables 
the assistant system to draw conclusions on the human tasks actually performed. 
Manual interactions that are taken into consideration include the currently 
displayed pages on the displays, pushed buttons, current system settings (for 
example, landing gear), as well as control stick inputs. Visual interactions taken 
into account for this model are provided by an object-related gaze analysis. 

Resource model 

In a further step, the determined actual task(s) are associated with a model of 
human resource consumption. This model is based on Wickens’ Multiple-Resource 
Theory (Wickens & Hollands, 2000) and estimates the required human resources by 
use of eight-dimensional demand vectors (Wickens, 2002). Every demand vector 
symbolizes the demand a single task poses on the human operator expressed in 
terms of information acquisition, information processing, and response. Data were 
gathered through knowledge acquisition experiments, in which German Army 
aviators had to rate individual resource demands that arise during the different 
mission tasks on a five-point Likert-scale. To eliminate subjective influences from 
these models as far as possible, laboratory experiments have been conducted 
(Maiwald, 2013) to better match the predicted resource conflicts within distinct 
task situations with the objectively measured pilots’ performance. Based on these 
experiments, we applied machine learning methods (that is, genetic algorithms) to 
adapt the underlying human resource model to the measured human performance 
(Maiwald & Schulte, 2012). 

Table 11.1 explains the demand vectors in detail using the sample tasks 
“Approach to Pick-up Zone” and “Change zoom on map.” To estimate the current 
resource utilization, a modified Visual-Auditory-Cognitive-Psychomotor (VACP) 
model (Aldrich, Craddock & McCracken, 1984) is used. This enables the assistant 
system to determine the remaining resources of the operator available to assistant 
system interventions. 

Table 11.1 Demand vectors for two different sample tasks 

                      Information Acquisition                     Processing            Reaction
Task                  visual/    visual/    auditory/  auditory/  cognitive/ cognitive/ manual  verbal
                      spatial    verbal     analog     verbal     spatial    verbal
approach Pickup-Zone  3          2          2          0          2          2          3       0
change map zoom       1          0          0          0          2          0          1       0
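
The per-channel bookkeeping behind such a VACP-style estimate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two demand vectors are taken from Table 11.1, but the function names, the plain summation of ratings per channel, and the per-channel capacity of 5 are hypothetical and do not reproduce the modified VACP model used in MiRA.

```python
# Illustrative sketch only; not the modified VACP model used in MiRA.
CHANNELS = (
    "visual/spatial", "visual/verbal", "auditory/analog", "auditory/verbal",
    "cognitive/spatial", "cognitive/verbal", "manual", "verbal",
)

# Demand vectors of the two sample tasks from Table 11.1.
APPROACH_PICKUP_ZONE = (3, 2, 2, 0, 2, 2, 3, 0)
CHANGE_MAP_ZOOM = (1, 0, 0, 0, 2, 0, 1, 0)


def channel_load(tasks):
    """Sum the demand ratings of all concurrent tasks per resource channel."""
    if not tasks:
        return [0] * len(CHANNELS)
    return [sum(column) for column in zip(*tasks)]


def remaining_resources(tasks, capacity=5):
    """Return the spare capacity per channel (capacity is an assumed value)."""
    return {
        name: max(0, capacity - load)
        for name, load in zip(CHANNELS, channel_load(tasks))
    }


load = channel_load([APPROACH_PICKUP_ZONE, CHANGE_MAP_ZOOM])
spare = remaining_resources([APPROACH_PICKUP_ZONE, CHANGE_MAP_ZOOM])
```

Under these assumptions the auditory/verbal input channel and the verbal response channel carry no load from either task, which is the kind of information the assistant system needs when deciding how to address the pilot.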

In addition, the predictive resource consumption model incorporates metrics 
of task and resource conflicts for estimating the pilot’s current workload. For this 
purpose, the demand vectors of the current tasks are applied in pairs to a modified 
workload index model (W/INDEX, Wickens, 2002). The modified metric we 
applied to the W/INDEX model eliminates any limitation on the number of 
tasks that can be examined in parallel. When considering n tasks in parallel, we 
obtain k pairwise conflict values $TKW_i$ ($i \in \{1, \ldots, k\}$), where 

$$k = \frac{n!}{2 \cdot (n-2)!} \qquad (1)$$


These resulting pairwise conflict values can be summarized as a row 
vector $TKW = \{TKW_1, TKW_2, \ldots, TKW_k\}$. The entire estimated workload 
is defined according to the following formula as the geometric sum of the pairwise 
conflict values. 

$$\text{Workload} \approx \text{Conflict} = \sqrt{(TKW_1)^2 + \ldots + (TKW_k)^2} = \sqrt{\sum_{m=1}^{k} TKW_m^2} \qquad (2)$$
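
As a small worked example of Equations (1) and (2), the following Python sketch computes the number of task pairs and the resulting workload value. The function names and the conflict values passed in are hypothetical; how each pairwise conflict value is obtained from the modified W/INDEX model is not shown here.

```python
from math import comb, sqrt


def number_of_pairs(n_tasks):
    """Equation (1): k = n! / (2 * (n - 2)!), the number of task pairs."""
    return comb(n_tasks, 2)


def workload(pairwise_conflicts):
    """Equation (2): square root of the sum of squared pairwise conflict values."""
    return sqrt(sum(value ** 2 for value in pairwise_conflicts))


k = number_of_pairs(3)               # three parallel tasks give three pairs
example = workload([2.0, 3.0, 6.0])  # made-up conflict values, for illustration
```

Because the values are squared before summation, a single large conflict dominates the result more strongly than several small ones would.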


Determination of best interaction resource 

To minimize the additional resource demands imposed when the pilot must perceive 
system-initiated warnings or interact with the offered support, we consider the 
pilot’s current tasks and the derived level of workload (that is, the utilization of 
resources). First, we overlay the pilot’s current tasks with hypothetical interactions 
(for example, dialogues to be initiated by the assistant system) using four 
different combinations of code and modality. Then, each of these possible task 
combinations is rated by the modified W/INDEX and VACP resource models with 
respect to the resulting utilization of resources. We regard the following four 
potential interaction channels: visual-spatial (for example, conformal symbols 
on visual displays), visual-verbal (text messages on visual displays), auditory- 
analog (coded auditory sounds), and auditory-verbal (speech synthesis). Finally, 
we choose the interaction resource that generates the lowest additional workload 
value in the current situation to initiate the dialogue with the pilot. 
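
The selection step described above can be illustrated with a minimal Python sketch. It is not the MiRA implementation: the conflict coefficients are placeholders rather than the values of Wickens’ conflict matrix, the pairwise conflict formula is a simplified W/INDEX-style stand-in, and the demand vectors of the four dialogue options are invented for illustration. Only the approach task vector is taken from Table 11.1.

```python
from itertools import combinations
from math import sqrt

# Channel order follows Table 11.1 (eight-dimensional demand vectors).
APPROACH_PICKUP_ZONE = (3, 2, 2, 0, 2, 2, 3, 0)

# Hypothetical demand vectors for the four dialogue options (assumed values).
DIALOGUE_OPTIONS = {
    "visual-spatial":  (2, 0, 0, 0, 1, 0, 0, 0),
    "visual-verbal":   (0, 2, 0, 0, 0, 1, 0, 0),
    "auditory-analog": (0, 0, 2, 0, 1, 0, 0, 0),
    "auditory-verbal": (0, 0, 0, 2, 0, 1, 0, 0),
}


def conflict_coefficient(i, j):
    """Placeholder weights: full conflict on the same channel, weak otherwise."""
    return 1.0 if i == j else 0.3


def pairwise_conflict(task_a, task_b):
    """Simplified W/INDEX-style conflict value for one pair of tasks."""
    total = 0.0
    for i, demand_a in enumerate(task_a):
        for j, demand_b in enumerate(task_b):
            if demand_a > 0 and demand_b > 0:
                total += conflict_coefficient(i, j) * (demand_a + demand_b)
    return total


def workload(tasks):
    """Root of the summed squared conflict values over all task pairs."""
    pairs = combinations(tasks, 2)
    return sqrt(sum(pairwise_conflict(a, b) ** 2 for a, b in pairs))


def best_interaction_channel(current_tasks, options=DIALOGUE_OPTIONS):
    """Return the option with the lowest additional workload, plus all scores."""
    baseline = workload(current_tasks)
    added = {
        name: workload(list(current_tasks) + [vector]) - baseline
        for name, vector in options.items()
    }
    return min(added, key=added.get), added


channel, added_workload = best_interaction_channel([APPROACH_PICKUP_ZONE])
```

With the approach task as the only current task, this toy parameterization selects the auditory-verbal channel, because the approach task places no demand on it.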


Prototype implementation 

For implementation of MiRA we developed a modular structure derived from 
the processing blueprint of the cognitive process (Putzer & Onken, 2003). The 
cognitive process comprises six cognitive sub-processes, which will be run through 
continuously. The cognitive sub-processes are referred to as (1) data acquisition 
from the external world, (2) situation interpretation, (3) goal determination (that 
is, situation diagnosis), (4) planning and decision making, (5) scheduling of the 
tasks to be performed, and (6) control and execution of the derived actions. 

Figure 11.5 Military Rotorcraft Associate (MiRA) module structure including pilot resource model 


The modules of MiRA follow these stages of the cognitive process. As depicted 
in Figure 11.5, this approach enables us to utilize the individual components and 
modules for both the assistant system and the embedded resource model. However, 
the assistant system and the pilot resource model focus on different aspects. 

The functions of MiRA have been designed to act in two distinct roles to 
support the human, that is, associative assistance and alerting assistance (Onken 
& Schulte, 2010). Associative assistance means the continuous presentation of 
decision-aids to the human operator without reflecting his or her cognitive state. 
For example, the proposed speed, proposed course, and so on would simply be 
part of the primary flight display. Alerting assistance will only be offered when 
needed. The purpose of this style of assistance is to make sure that in case of 
discrepancies between the results of both MiRA and the pilot, these discrepancies 
will be communicated by actively drawing the operator’s attention toward this 
fact. All advisories are presented in a resource adaptive fashion by deploying 
different codes and modalities in accordance with the aforementioned prototype 
of the resource model. 

However, not all possible combinations of codes and modalities are suitable for 
conveying the required hints and warnings through attention-guiding dialogues. A 
summary of feasible combinations of codes and modalities is given in Table 11.2. 
If the resource model designates a code and modality that is not available for the 
interaction in the specific case, then the second-best available interaction 
resource is selected. 


Table 11.2 Allocation of Military Rotorcraft Associate’s (MiRA) interventions to perceptual resources 

                                          Options for Information Presentation
Interventions (Use-Cases)                 visual-analog  visual-verbal  auditive-symbol  auditive-verbal
                                          (VA)           (VV)           (AS)             (AV)
Violation of time constraints             available      available      -                available
Violation of safety critical altitude     -              available      available        available
Dangerous ground proximity                -              available      available        available
Incorrect landing gear setting            -              available      available        available
Incorrect transponder setting             -              available      available        available
Violation of transit corridor             -              available      -                available
Missed communication                      -              available      -                available
Information on planning status            -              available      -                available
Heading announcement for next flight leg  available      available      -                available
Speed, heading, altitude announcements    available      available      -                -
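
The fallback rule for unavailable presentation options can be sketched as a ranking filtered by feasibility. The excerpt of Table 11.2 encoded below, the function name, and the workload scores are illustrative assumptions; in particular, the scores are made-up numbers standing in for the output of the resource model.

```python
# Excerpt of Table 11.2: feasible presentation options per use case.
FEASIBLE_OPTIONS = {
    "Violation of time constraints": {"VA", "VV", "AV"},
    "Violation of transit corridor": {"VV", "AV"},
    "Speed, heading, altitude announcements": {"VA", "VV"},
}


def select_option(use_case, added_workload):
    """Pick the feasible option with the lowest additional workload.

    added_workload maps option codes (VA, VV, AS, AV) to workload scores.
    Returns None if no scored option is feasible for the use case.
    """
    feasible = FEASIBLE_OPTIONS[use_case]
    ranked = sorted(added_workload, key=added_workload.get)
    for option in ranked:
        if option in feasible:
            return option
    return None


# Made-up scores: AS would be cheapest, but it is not feasible everywhere.
scores = {"VA": 19.4, "VV": 18.7, "AS": 12.0, "AV": 15.9}
choice = select_option("Violation of transit corridor", scores)
```

In this example the cheapest option (AS) is skipped for the transit corridor warning and the next-best feasible option (AV) is used instead.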




Experimental Evaluation 

To evaluate the benefits of MiRA, especially the adaptive automation aspects, 
we conducted a human-in-the-loop evaluation in the spring of 2011. MiRA was 
part of a larger research effort on MUM-T missions. A fully working functional 
prototype, representing all aspects of DMCA, was implemented for the first time 
and evaluated in realistic mission scenarios in a virtual simulation environment 
with German Army aviators as test subjects. 

The experimental system was a generic two-seat side-by-side military transport 
helicopter cockpit simulator. This facility hosts a complex mission simulation 
system that was used for the development and integration of the prototype. A 
comprehensive overview of the activities and results is provided in Strenzke et 
al. (2011). 

Experimental setup 

Eight German Army aviators with an average age of 37 years (min. 28, max. 51) 
participated as test subjects. Their flying experience ranged from 830h up to 5,100h 
with an average of 1,815h. The experiments were conducted using two different 
experimental configurations. In the adaptive configuration MiRA communicated 
with the pilot by adaptive use of speech output, text messages, audio-alerts, or 
symbolic display messages (eight subjects). The non-adaptive configuration was 
further subdivided composing dialogues either via aural speech messages (four 
subjects) or visual text messages (four subjects). Each subject participated in the 
adaptive as well as in the non-adaptive configurations. They were initially briefed 
on the nature of the human-machine interfaces. 

The missions lasted about 30 to 45 minutes and had the primary objective 
of transporting troops from a friendly Pick-up Zone into a hostile Drop Zone. It was 
mandatory to use corridors with distinct opening times to transit from friendly to 
enemy territory and back. 

The commander (represented by the pilot not flying, PNF) was entitled to control 
three UAVs, which supported the mission by taking over reconnaissance tasks 
such as exploring the helicopter routes and landing sites. Each mission contained 
a follow-up mission order that was received by the crew upon accomplishment 
of the primary mission (that is, after dropping troops at the Drop Zone). The 
follow-up mission contained either a second troop transport (within hostile territory) 
or the recovery of a crashed pilot (from hostile to friendly territory). A video 
documentation of the whole experiment is available (Schulte, 2012b). 

For the validation of the MiRA prototype, only aspects relating to the pilot 
are discussed in the following sections. Results concerning the UAV mission 
management by the PNF can be found in Strenzke et al. (2011). In our experiment, 
we instructed the crew to operate below an altitude of 150ft Above Ground Level 
(AGL) in enemy territory. This altitude describes a safety-critical parameter, due 
to increasing exposure to enemy air defense when violating this requirement. 

Experimental results 

We accumulated altitude violations over the violation time as a performance 
parameter. This was computed for both the adaptive and the non-adaptive 
configurations. A t-test was used to compare these two configurations. 
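
One plausible reading of this performance parameter is the altitude exceedance integrated over time. The sketch below illustrates that reading only; the function name, the sampling format, and the left-rectangle integration are assumptions, and only the 150 ft AGL limit comes from the text.

```python
def accumulated_altitude_violation(samples, limit_ft=150.0):
    """Integrate altitude above the limit over time (result in ft*s).

    samples: list of (time_s, altitude_agl_ft) tuples sorted by time.
    Each sample's exceedance is held until the next sample (left rectangles).
    """
    total = 0.0
    for (t0, altitude), (t1, _next_altitude) in zip(samples, samples[1:]):
        total += max(0.0, altitude - limit_ft) * (t1 - t0)
    return total


# Hypothetical track: the pilot climbs above 150 ft AGL for two seconds.
track = [(0.0, 100.0), (1.0, 160.0), (2.0, 200.0), (3.0, 140.0)]
violation = accumulated_altitude_violation(track)
```

A measure of this kind penalizes both how far and how long the limit is exceeded, so a brief large excursion and a long small one can yield similar values.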

Altitude-related violations decreased by more than 50 percent in the adaptive 
configuration compared to the non-adaptive configuration (Figure 11.6). 
Performance improved significantly in our experiments (t(4)=2.17, p=0.048, n=4) 
when transferring information in a resource-adaptive way. 

After each mission, questionnaires (that is, NASA-TLX, ratings on different 
configurations of the assistant system and on the overall system evaluation) were 
presented to the test subjects. To ensure inter-individual comparability, all 
NASA-TLX ratings were normalized to account for differing use of the workload scales. 

Figure 11.6 Altitude violations in adaptive versus non-adaptive configuration 



Figure 11.7 Pilot subjective workload (normalized NASA-TLX) 

As depicted in Figure 11.7, pilots rated the configuration with text messages 
only as imposing the highest subjective workload (42.6 percent on average). In 
contrast, the resource-adaptive configuration received an average workload 
rating of 30.6 percent. The workload reduction between these two 
configurations was found to be significant using a two-sided t-test (t(32)=2.06, 
p=0.047, SD=9.97, n1=12, n2=22). In addition, the workload decreased with weak 
significance from 38.9 percent in the speech-only configuration to 30.6 percent in 
resource-adaptive mode (t(32)=1.87, p=0.07, SD=10.2, n1=12, n2=22). 

Furthermore, subjective ratings regarding the specific benefit of the pilot 
assistant system in resource-adaptive mode were taken. A non-significant trend 
(t(14)=1.95, p=0.07) suggested that pilots perceived the best support in the 
resource-adaptive configuration. Comparing the resource-adaptive configuration 
with the text-message-only configuration, the pilots felt significantly better 
supported in resource-adaptive mode (t(10)=5.06, p<0.001). A comparison between 
resource-adaptive and speech-only mode found no significant difference. 


Conclusions 

The approaches described in this chapter stem from the field of cognitive system 
ergonomics, where the consideration of the whole human work process leads to 
a reallocation of cognitive tasks between the acting entities; that is, the human 
operators and the artificial cognitive agents. This means not only a fixed 
allocation of tasks to humans and automation according to their individual 
strengths and weaknesses, but also an adaptation of the function allocation to the 
current situation. In this context concepts summarized under the term of adaptive 
automation show benefits by increasing the total performance. These principles 
have been comprehensively described by Onken and Schulte (2010) in the Dual- 
Mode Cognitive Design approach. This concept provides guidelines on how 
artificial cognitive agents can be integrated in human-machine systems, and how 
they are to be designed. 

In this chapter we adopted the aforementioned guidelines of DMCA to design 
MiRA, a knowledge-based assistant system for the pilot flying in the domain of 
military helicopter missions. In this research, we enhanced MiRA with adaptive 
capabilities. For this purpose, we developed a concept for estimating the pilot’s 
residual capacity as well as the current cognitive workload. By using models 
that consider the current resource allocation, the assistant system was 
able to make the best use of the pilot’s remaining resources to convey 
system-initiated dialogues. This prevented the pilot from being overtaxed, which 
maximized overall system performance. 

Besides the proof of theoretical concepts, the mission simulator environment 
used in the research allowed the demonstration of operational benefits of the 
MiRA and MUM-T systems. The overall evaluation of the adaptive assistant 
system showed that the altitude-related exposure to potential threats could be 
significantly reduced using the resource-adaptive mode. Pilots reported decreased 
workload and felt best supported in the resource-adaptive configuration. Finally, 
MiRA was rated to be a helpful electronic crewmember increasing the mission 
efficiency and safety. Our future work will incorporate trials for a further 
validation of our resource model prototype, in particular concerning the demand 
vectors. Furthermore, we will apply our presented concept in the domain of 
civilian aircraft, such as emergency helicopter missions. In this context, we intend 
to enhance the model-based task prediction by developing a hybrid approach 
incorporating human behavior models along the lines of the work presented in 
Schulte & Donath (2011). 

Chapter 12 

Individual Pilot Factors Predict Runway 
Incursion Outcomes 

Kathleen Van Benthem & Chris M. Herdman 
Carleton University, Canada 


Runway incidents are a critical safety issue facing the aviation industry, with 
worldwide costs estimated at a billion US dollars annually (Honeywell Aerospace, 
2009). A runway incursion is an event where a vehicle or object has entered an 
active runway without clearance. Many incursions involve other aircraft, but they 
can also include service vehicles as well as wildlife that can be large or numerous 
enough to pose a risk to aircraft. This chapter examines the concern over incursions 
from the unique perspective of the general aviation pilot encountering a runway 
incursion upon landing. Until such time as runway incursions no longer occur— 
if that were possible—the ability of pilots on approach to detect and avoid an 
incursion remains an important aspect of runway safety. The following discussion 
is guided by the results of research where pilots landing a high-fidelity Cessna 172 
simulator were faced with an approaching aircraft traveling in the wrong direction 
on the active runway. Our research is focused on the relationship of pilot age, 
expertise, and cognition to the outcome of the incursion scenario. 

The Prevalence of Incursion Events 

Despite careful planning, pilots commonly encounter surprise events during flight. 
In one database analysis researchers found that 5 percent of all surprising events 
were caused by other aircraft, with a number of these involving runway incursions 
(Kochan, Breiter & Jentsch, 2004). Each year almost 1,000 incursions are reported 
across the US (Federal Aviation Administration, 2010) and this figure does not 
include incursions occurring at non-towered airports. The National Transportation 
Safety Board (NTSB) in the US has included improving airport surface operations 
safety on its “Most Wanted” list (National Transportation Safety Board, 2012). The 
NTSB highlights runway incursions as one of its most critical surface operations 
issues and calls for pilot training as one measure aimed at reducing the occurrence 
and effects of runway incursions. Similarly, the Transportation Safety Board of 
Canada (TSB) regularly publishes “Watchlist Fact Sheets” that highlight areas of 
concern in aviation safety (Transportation Safety Board of Canada, 2012). Risk of 
collisions on runways is one of four main aviation safety threats in Canada and has 
been on the “Watchlist” for three years. A call for new or strengthened efforts has 
been made in light of indications that the number of runway incursions in Canada 
has increased since the Watchlist campaign was first launched (Transportation 
Safety Board of Canada, 2012). 

General Aviation (GA) Pilots and Runway Incursion Detection 

Compared to pilots of other ratings, General Aviation (GA) pilots are 
over-represented in runway incursion statistics (Federal Aviation Administration, 2010). 
The FAA’s Annual Runway Safety Report (2010) indicated that human error was 
a contributing factor to most runway incursion incidents, with GA pilot deviations 
responsible for 77 percent of pilot-caused incursions at towered airports. When a 
runway incursion is present on the active runway there are reasons why GA pilots 
might be at a disadvantage when compared to their commercial pilot counterparts. 
GA pilots are more likely to fly simpler aircraft with fewer automated 
traffic-warning systems in the cockpit. GA pilots fly primarily at non-towered airports, 
where there may be fewer “eyes” tracking runway traffic. Many GA 
instrument-rated flights, for example, are purported to be single pilot operations (Veillette, 
2009), thus the responsibility for detecting incursions, such as deviating aircraft, 
is placed solely on the approaching pilot. 

Opportunities for assessment of incursion management are not consistent 
across the population of GA pilots. Depending on the type of license held, a pilot 
might only undergo a competency evaluation every two years in accordance with 
Federal Aviation Regulation Section 61.56—Flight review (Federal Aviation 
Administration, 2006). Proficiencies evaluated at the time of the competency 
review vary according to issues highlighted by the pilot or concerns noted by 
the pilot examiner. Detecting and managing runway incursions during landing are 
not explicitly outlined in FAA documentation as a skill to be assessed during a 
proficiency review (Federal Aviation Administration, 2006). Consequently, the 
onus rests with the pilot examiner and pilot under review to consider incursion 
management as part of the proficiency check. In essence, the effectiveness of 
the GA pilot competency review relies in part upon the pilot’s own subjective 
evaluation of piloting skills, thus placing self-rating of cognitive and perceptual 
abilities as central to GA pilot skill maintenance. Review of skills necessary for 
handling surprise events can occur even more sporadically in other countries, such 
as Canada, where biennial proficiency checks are not required for most private 
pilots. Two-year currency requirements in Canada can be met in various ways, 
including self-paced study programs or Transport Canada seminars where capacity 
for detection and avoidance during landing might not be directly addressed. 

Predictors of Surprise Event Management 

Research regarding the implementation of traffic alert or head-up information 
systems has dealt most directly with runway incursion management, although these 
studies do not typically examine the relationship of individual pilot characteristics 
to performance. On the other hand, studies that do investigate the relationship of 
individual pilot attributes to GA tasks have not looked specifically at managing a 
surprise incursion upon landing. Both lines of research will be examined briefly 
here. 

Incursion Detection Research 

Studies that include surprise incursions during landing frequently have the 
purpose of evaluating the effect of new cockpit technology on conflict avoidance 
(Jones & Prinzel, 2006; Wickens & Long, 1995). The scenario presented by Jones 
and Prinzel (2006) is most relevant to the issue of GA pilots and detection of 
incursions upon landing. An incursion scenario was presented to 16 GA pilots in 
a flight simulation study looking at effectiveness of various visual and auditory 
cockpit alerts for runway traffic. The sample was comprised of four pilots in each 
of four expertise groups (low- and high-time visual flight rules rated pilots and 
low- and high-time instrument-rated pilots). In one scenario, “surprise” traffic 
was holding for departure on the active runway at the same time as pilots were 
attempting a landing. In phase one of the study two pilots never saw the incursion, 
eight pilots noticed the incursion out the window before the cockpit alert, one 
pilot was alerted via a cockpit surface map, and five observed the incursion out 
the window after an alert (two pilots who approached from a wrong heading were 
excluded). In phase two of the research another incursion scenario involved traffic 
at a crossing runway during landing. In the crossing scenario, over 80 percent of 
the pilots with a basic six-instrument cockpit (the electronic research display) did 
not visually detect the incursion and just 11 percent completed the appropriate go 
around response (Jones & Prinzel, 2006). 

Wickens and Long (1995) also introduced a surprise incursion at the end of a 
flight simulation study to examine the effects of conformal versus non-conformal 
symbology in head-up (HUD) and head-down (HDD) displays. Subjects in this 
study included 14 pilots with at least a private pilot rating and the remaining 18 
pilots were rated at the certified flight instructor or commercial (including airline) 
rating or higher. Detection latency for the incursion (a wide-body jetliner) was 
most adversely affected by the HUD. The phenomenon of cognitive tunneling 
was suggested as the mechanism that interfered with detecting the incursion 
stimulus (Wickens & Long, 1995). Cognitive tunneling acts to limit out-the- 
window registration of stimuli due to the focal location of the information on 
the windscreen and the associated locking of attention onto the HUD (Jarmasz, 
Herdman & Johannsdottir, 2005). 

Individual Pilot Factors and General Aviation (GA) Performance 

Literature noting the association of age, expertise, and cognitive factors with flight 
performance (for nominal and off-nominal events) suggests pilot characteristics 
that are potentially linked to capacity for managing a runway incursion upon 
landing. There is a trend in NTSB database analyses for older GA pilots to have 
a higher accident risk when compared to younger pilots (starting at 35 years of 
age) (Bazargan & Guzhva, 2011; Li, Baker, Qiang, Grabowski & McCarthy, 2005; 
National Transportation Safety Board, 2011). Li et al. (2005) explored various 
risk factors associated with aviation accidents and found that, in comparison to 
the reference group (age 25 to 34 years), pilots aged 65 years and older were 
almost three times as likely to be involved in an accident. Bazargan and Guzhva 
(2011) found that between 1983 and 2002 GA pilots aged 65 years and older 
were significantly more likely to be involved in a crash involving a fatality than 
younger pilots were. Surprisingly, perhaps, pilots with the least amount of flight 
hours were the least likely to be involved in a crash involving a fatality (Bazargan 
& Guzhva, 2011). The possible interaction between pilot age and expertise has 
challenged our understanding of the effects of age on accident risk in GA. The 
trend for increased risk of critical incidents for older pilots is supported by flight 
simulation research (Adamson et al., 2010; Morrow, Leiber & Yesavage, 1990; 
Taylor, O’Hara, Mumenthaler, Rosen & Yesavage, 2005; Van Benthem, Herdman, 
Brown & Barr, 2011; Yesavage et al., 2011). 

Pilot age and cognition as predictors of flight performance 

The issue of age, cognition, and pilot performance has been studied in longitudinal 
and cross-sectional analyses from a large cohort of GA pilots with consistent 
findings that younger pilots tend to significantly outperform older pilots on key 
simulated aviation tasks (Taylor, Kennedy, Noda & Yesavage, 2007; Yesavage, 
Taylor, Mumenthaler, Noda & O’Hara, 1999; Yesavage et al., 2011). Yesavage et 
al. (1999) reported on a sample of 100 pilots (aged 50 to 69 years at baseline) and 
found that overall simulated flight performance was reduced for the older pilots, 
with age predicting 22 percent of the overall variance in traffic avoidance and 
approach scores. In a subsequent analysis (N=118, age range was 40-79 years) 
it was found that four cognitive factors accounted for 45 percent of the variance 
in simulator scores (working memory/processing speed, visual memory, motor 
coordination, and tracking) and that pilot age contributed significant variance 
in addition to the cognitive factors (Taylor, O’Hara, Mumenthaler & Yesavage, 
2000). Taylor et al. (2005; 2007) reported lower performance for older pilots 
when following ATC messages, traffic avoidance, cockpit instrument scanning, 
and approach and landing ability. In particular, age predicted 28 percent of the 
variance in traffic avoidance and approach. 

Coffey, Herdman, Brown and Wade (2007) found that older pilots missed more 
critical events inside and outside the cockpit than younger pilots. In the Coffey et 
al. (2007) study seven younger (M = 24.4 years, SD = 4.1) and seven older (M = 
65.7 years, SD = 5.4) pilots flew a desktop Cessna 172 simulator and were scored 
on correct identification of critical events. The age-effects associated with change 
detection were one possible explanation for the older pilots missing more events 
both inside and outside the cockpit. 

The cognitive construct of situation awareness has been studied regarding 
its relationship to flight performance. Situation awareness is often described as 
perceiving and integrating relevant stimuli within a meaningful volume of time 
and space and using selected stimuli to build a mental model of the environment 
and project that model into the future (Endsley, 1988, p. 97). So described, the 
centrality of situation awareness in the detection and management of runway 
incursions is obvious. Using Endsley’s Situation Awareness Global Assessment 
Technique (Endsley, 2000), older pilots (M = 59.4 years, SD = 7.6) have been 
shown to have poorer situation awareness than younger pilots (M = 40 years, 
SD = 7.1) in simulated flight tasks (for a full description see Van 
Benthem et al., 2011). Identifying an incursion upon approach might be considered 
one example of correctly perceiving important stimuli. In light of the age- and 
cognition-effects noted for other aviation tasks, some requiring quick decision 
making and traffic avoidance, findings suggest that pilot age and cognitive function 
might be predictors of performance managing an unexpected runway incursion 
upon final approach. 

Pilot expertise as a predictor of flight performance 

Higher levels of pilot expertise, as indexed by total flight hours and pilot rating, 
have been associated with better performance for simulated flight tasks. Higher 
levels of total hours flown significantly predicted both general performance (flight 
path deviation) and go/no-go decision making for a cross-wind landing (Causse, 
Dehais & Pastor, 2011). Causse et al. (2011) found that a correct response on the 
landing decision task for 24 pilots (mean age 44.3 years, SD = 13.6) was also 
predicted by higher performance on a 2-back working memory task and lower 
motor impulsivity. Other studies have consistently found that pilot rating serves 
as the most reliable index of pilot expertise (Taylor et al., 2007; Yesavage et al., 
2011). Higher pilot rating has also been associated with better execution of 
air-traffic commands and avoiding potential conflicts with other traffic (Taylor et al., 
2011). Other reports of this longitudinal study have shown that pilot rating did 
not moderate age-effects for summary flight performance scores (Taylor et al., 
2005, 2007; Yesavage et al., 2011). In accord with the literature, higher levels of 
expertise, as indexed by pilot rating or total flight hours, might be associated with 
better runway incursion management. Expertise could also act as a moderator of 
potential effects of age on pilot performance. 

Subjective ratings as predictors of flight performance 

In light of the aviation industry’s reliance upon GA pilot self-assessment, it is 
worthwhile to examine the link between pilot self-rating and flight performance. 
Situation awareness and mental workload are commonly employed cognitive 
constructs in pilot self-rating (for example, Vidulich, 1988; Vidulich, Crabtree, 
& McCoy, 1993). Subjective indices of cognitive task workload and situation 
awareness as they compare to objective ratings of general performance have had 
limited treatment in the GA literature, and even less so when studying surprise 
events. Nonetheless, support has been found for the relevancy of pilot self-rating 
to performance detecting and avoiding other aircraft. 

Workload indices matched GA flight instructor performance and display 
preference in a study of cockpit traffic display types (Morphew & Wickens, 
1998). The lowest mental demand ratings from the NASA-TLX (a common 
measure of workload; Hart and Staveland, 1988) were associated with the 
display where pilots spent the least amount of time in conditions of “predicted 
conflict” (Morphew & Wickens, 1998, p. 54). The NASA-TLX ratings have also 
mirrored performance on flight path deviation scores in a study examining the 
effects of cockpit display and symbology type (Takallu, Wong, Bartolone, Hughes 
and Glaab, 2004). The sample was comprised of 18 GA pilots representing three 
levels of expertise (as per pilot rating and total flight hours): low expertise with 
a mean age of 44 years, medium expertise with a mean age of 38 years, and high 
expertise with a mean age of 56 years. Some symbology types displaying less 
guidance information were associated with the poorest actual flight performance. 
This reduced flight performance was congruently reflected in lower self-ratings 
of situation awareness using the Situation Awareness Rating Technique (SART, 
Taylor, 1990). The older, more experienced GA pilots appeared to be differentially 
more disadvantaged by the guidance symbology that displayed one of the least 
informative predictive vectors and the broadest tunnel to be used for guidance. 


Flight Simulation Study: Examining Predictors of Incursion Management 

The present study is one component of a larger study on GA pilot performance 
and individual pilot factors. The original larger study was comprised of 108 
subjects, all licensed and actively flying pilots. Fifteen subjects did not have data 
for the subjective rating scales and were removed from the present analysis. The 
remaining 93 subjects were aged 19 to 81 years (M = 46.5), held a current medical 
certification, and had flown within the past two years. Pilot expertise groups were 
defined by student, private, private with additional ratings, and advanced (airline 
transport, commercial, and military) ratings. Table 12.1 reports the group mean 
and range for age, total flight hours, and recent pilot-in-command hours for each 
pilot rating group. A one-way ANOVA revealed that age did not differ significantly 
between pilot rating groups (p > .1). Total flight hours and recent pilot-in-command 
hours increased significantly as pilot rating increased. 

Subjects flew a Cessna 172 non-motion simulator with instruments and controls 
integrated with Microsoft® Flight Simulator X. Three large screens positioned in 
an arc in front of the cockpit provided approximately 120 degrees of horizontal 
and 45 degrees of vertical field of view. All subjects completed a consent form and 
Table 12.1 Sample characteristics by pilot rating group 

Rating Group (N)                      Mean Age (range)   Mean Total Flight   Mean Pilot-in-Command Hours 
                                      (years)            Hours* (range)      Past 12 months* (range) 
1. Student (13)                       40.1 (22-59)       49.2 (5-144)        6.1 (0-20) 
2. Private (35)                       48.8 (19-81)       242.0 (56-815)      17.0 (0-87) 
3. Private (additional ratings) (27)  48.8 (19-69)       681.3 (90-3160)     29.7 (1-99) 
4. Advanced (commercial +) (18)       43.3 (20-79)       2035.0 (161-8000)   82.2 (0-308) 

Note: * Denotes significant differences between pilot rating groups, p < .05. 


a demographic and experience questionnaire before receiving detailed instructions 
regarding the simulator tasks. After an orientation and practice phase subjects 
were required to fly three left-hand patterns at an uncontrolled aerodrome in a 
low-difficulty condition and then three left-hand patterns in a high-difficulty condition. Pattern 
difficulty was manipulated via terrain and traffic volume. In the low-difficulty 
condition pilots flew over flat terrain and interacted with a maximum of two other 
aircraft in the pattern. In the high-difficulty condition the terrain was mountainous 
and the pilot interacted with four computer-generated aircraft that were relevant to 
the pattern. For all patterns, subjects were required to make radio calls providing 
details of their call sign, aircraft type, and location at routine points during the 
circuit. Subjects were asked to mentally note the similar information provided by 
the virtual pilots from relevant computer generated aircraft. The incursion scenario 
occurred at the end of the high-difficulty condition. 

Simulated runway incursion outcome measure 

During the high-difficulty condition the subjects were requested to fly touch- 
and-go patterns. While on final approach a “rogue” simulated aircraft (incursion) 
was introduced at the distal end of the runway, traveling in the wrong direction 
(that is, toward the oncoming pilot). Subjects were given no warning that this 
scenario might occur during the experimental session. The incursion scenario was 
introduced once, at the end of the testing session, to ensure that it was a true 
surprise. To manage the runway incursion subjects were 
to accomplish three key tasks in quick succession. The initial requirement was 
to correctly perceive and interpret the environmental stimuli. The second task 
was to use the interpreted information to make a decision regarding the required 
procedure to avoid the quickly approaching offending aircraft. Finally, the correct 
evasive action had to be selected and carried out in a 
timely and efficient manner. A runway incursion outcome score was calculated 
based on the quality, timing, and result of pilot response to the offending aircraft. 
Scores ranged from zero to ten and were recorded by the researchers during the 
testing scenario. Any question over pilot response during the incursion scenario 
could be resolved using the digital records, which logged ownship position and 
banking at one-second intervals. A score of zero denoted a situation where the 
pilot did not notice the incursion. A score of ten indicated that the pilot noticed the 
incursion with adequate time to make a radio call and perform a safe maneuver to 
avoid the incursion with no dangerous loss of separation between aircraft. Scores 
at or below 6/10 suggest high risk for a poor outcome. The selection of 6/10 as a 
division point for risk was informed by subject matter experts in flight instruction 
and safety. This is considered a generous estimate, as one could argue that unless a 
pilot achieves a score of nine or ten the risk for a poor outcome is high. See Table 
12.2 for the description of the runway incursion management scoring system. 


Table 12.2 Incursion management scoring system 

Score Component             Range   Interpretation 
Radio Call                  0-1     0 = no/late radio call made 
                                    1 = call made with adequate time 
Time to Detect              0-3     0 = no detection of runway incursion 
                                    1 = detected too late 
                                    2 = detected just in time 
                                    3 = detected adequately 
Time to Evasive Action      0-3     0 = no evasive action 
                                    1 = evasive action too late 
                                    2 = evasive action just in time 
                                    3 = evasive action adequately 
Quality of Evasion Action   0-3     0 = minimal separation or contact 
                                    1 = poor separation/evasion 
                                    2 = adequate separation/evasion 
                                    3 = high-quality separation/evasion 

Note: Scores at or below 6/10 are considered high risk for a poor outcome. The division of 
risk at 6/10 was selected as a method for interpreting the graphical results and was not used 
in statistical analyses. 
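
As an illustration of how the four components in Table 12.2 might combine into the zero-to-ten outcome score, the following is a minimal sketch. It assumes the total is the simple sum of the component ratings (the component ranges do add to ten, but the chapter does not state the summation rule explicitly). The class and field names are hypothetical and are not taken from the study materials.

```python
from dataclasses import dataclass

# Cutoff taken from the table note: totals at or below 6/10 are read as high risk.
HIGH_RISK_CUTOFF = 6

# Upper bound of each component range in Table 12.2 (the lower bound is always 0).
COMPONENT_MAX = {
    "radio_call": 1,
    "time_to_detect": 3,
    "time_to_evasive_action": 3,
    "quality_of_evasion": 3,
}


@dataclass(frozen=True)
class IncursionRating:
    """Hypothetical container for one pilot's four component ratings."""

    radio_call: int
    time_to_detect: int
    time_to_evasive_action: int
    quality_of_evasion: int

    def total(self) -> int:
        """Sum the components (assumed rule) after range-checking each one."""
        score = 0
        for name, upper in COMPONENT_MAX.items():
            value = getattr(self, name)
            if not 0 <= value <= upper:
                raise ValueError(f"{name} must be between 0 and {upper}, got {value}")
            score += value
        return score

    def is_high_risk(self) -> bool:
        """Apply the 6/10 division point used to read the graphical results."""
        return self.total() <= HIGH_RISK_CUTOFF
```

Under this reading, a pilot who never detects the incursion scores zero on every component, and a pilot rated at the top of every component scores ten, which matches the anchors described in the text.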


Cognitive tests 

The DriveABLE Cognitive Assessment Tool (DCAT) (DriveABLE, 1997), a 
cognitive screening tool for older drivers, was administered. The three 
subtests analyzed in this study were the motor speed and control (DCAT 1), span of 
attentional field (DCAT 2), and identification of driving situations test (DCAT 6). 
Age-normed scores reflected reaction time, attentional field of view, visual-spatial 
working memory, and short-term memory and decision making for brief auditory- 
visual video clips. The DCAT assessment was selected as a potential predictor of 
complex flight tasks because it has been validated as a predictor of performance 
for older drivers (DriveABLE, 1997). While differences exist between driving 
and piloting an aircraft, the underlying cognitive factors of working memory, 
attentional field of view, processing speed, and decision making are common to 
both domains, and are purportedly measured by the DCAT (DriveABLE, 1997). 
To our understanding the current GA study is the first application of the DCAT as 
a measurement tool in the aviation domain. 

Subjective ratings of workload and situation awareness 

A modified version of the NASA-TLX (Hart and Staveland, 1988) was used to gauge 
subjects’ perception of workload in both difficulty conditions. The NASA-TLX was 
altered so that the scale contained ten gradations for response (reduced from the 21 
on the original TLX). Since the scale was new to most pilots a definition of each 
item (mental demand, effort, and so on) was provided in written form at the top of 
the page. There were seven categories of workload on a continuous 100mm rating 
scale: physical demand, mental demand, temporal demand, performance, effort, 
frustration, and overall workload (the last not in the original TLX). The present scores were 
not weighted. Subjective situation awareness was measured using a seven-point scale 
designed specifically for aviation tasks at the ACE Laboratory at Carleton University 
and was based on the scale reported by Hou, Kobierski and Brown (2007). The scale 
consisted of 11 items representing various aspects of situation awareness encountered 
during flight and included reference to awareness of the environment, tasks, time, 
priorities, awareness of how the perfect circuit should be flown, relative position of 
ownship and other aircraft, future events, and overall rating of situation awareness. 

Study results 

More than half of all the subjects (n = 48) in the present analysis achieved incursion 
management scores at or below 6/10 suggesting a high risk for a poor outcome. 
A frequent poor outcome included dangerous loss of separation between ownship 
and the rogue aircraft or proceeding to land on the runway. Fourteen pilots from 
the high-risk group (15 percent of the sample) failed to detect or respond to the 
runway incursion, and did not initiate the required go-around response. The 
remaining analyses examine the relationships between individual pilot factors and 
their runway incursion management scores. 

Pilot age, expertise, cognition, and runway incursion management 
Figure 12.1 depicts the relationship between age, expertise, and incursion 
management scores. Figure 12.1 shows mean incursion management scores 
according to low or high expertise (by combining the two lower pilot rating 
groups and the two higher pilot rating groups) and by pilot age (grouped according 
to decade, with the oldest group representing pilots aged 70 and older and the 
youngest group pilots less than 30 years). In Figure 12.1, a score of 6/10 reflects a 
high risk for a poor outcome. Age was not linearly associated with incursion 
management scores (r = .09, p = .25). As shown in Figure 12.1, there was a general 
trend for the youngest and oldest pilots from the low-expertise group to perform 
similarly and poorly with respect to detecting and avoiding the runway incursion. 
In the high-expertise group the oldest and second youngest group exhibited this 
pattern as well. These descriptive findings should be interpreted cautiously as 
the oldest age group contained only three pilots. In a secondary analysis a trend 
toward a cubic relationship between age (six groups) and incursion management, 
F(3, 89) = 1.97, p = .12, was found. 
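
The grouping used for Figure 12.1 can be sketched as follows. This is an illustrative reconstruction from the description in the text, not the authors' analysis code: the function names and the numeric coding of pilot rating (1 = student through 4 = advanced, following Table 12.1) are assumptions, and the records in any usage would be the study's own data.

```python
from collections import defaultdict


def age_group(age: int) -> str:
    """Bin age by decade, with open-ended youngest (<30) and oldest (70+) groups."""
    if age < 30:
        return "<30"
    if age >= 70:
        return "70+"
    decade = (age // 10) * 10
    return f"{decade}-{decade + 9}"


def expertise_group(rating_level: int) -> str:
    """Collapse the four rating groups: 1-2 -> low expertise, 3-4 -> high expertise."""
    if rating_level not in (1, 2, 3, 4):
        raise ValueError(f"rating_level must be 1-4, got {rating_level}")
    return "low" if rating_level <= 2 else "high"


def mean_scores_by_cell(records):
    """Average incursion scores per (age group, expertise group) cell.

    `records` is an iterable of (age, rating_level, incursion_score) tuples.
    """
    cells = defaultdict(list)
    for age, rating_level, score in records:
        cells[(age_group(age), expertise_group(rating_level))].append(score)
    return {cell: sum(scores) / len(scores) for cell, scores in cells.items()}
```

Binning this way yields the six age groups mentioned in the secondary analysis, and it also makes the small-cell problem visible: any cell with only a few pilots produces a mean that should be read cautiously.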



Figure 12.1 Mean incursion management scores by age and expertise groups 
(x-axis: age groups in years) 

In Figure 12.1, the horizontal line at 6/10 suggests point of high risk. Above 
the line is lower risk, and below the line is higher risk for a poor outcome. Note 
that the mean incursion score for all age groups in the low-expertise group resides 
in a high-risk zone (<=6/10). Student and private pilots with no additional ratings 
made up the low-expertise group. Private pilots with additional ratings and ATP, 
commercial, and military pilots comprised the high-expertise group. 

Figure 12.1 also shows the effect of expertise on incursion management. 
As seen in Table 12.3, pilot rating was significantly correlated with incursion 
management. A one-way ANOVA showed a significant overall effect of the four 
pilot rating groups, F(3, 89) = 9.05, p = .001, partial η² = .23. As revealed in Figure 12.1, 
high-expertise group means fell above 6/10, with the exception of the second 
youngest and oldest groups. The mean incursion management scores for the 
low-expertise group were all below the suggested critical score of 6/10 for each age 
group. While all cell sizes are not adequate for a robust six (age group) by four 
(pilot rating group) analysis of individual pilot factors, the earlier description of 
the sample (see Table 12.1) indicated that lower pilot rating was significantly 
associated with fewer total and recent pilot-in-command hours. The interaction 
term between age and pilot rating was not significantly correlated with incursion 
scores, p = .46. 
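
The effect size reported for the pilot rating ANOVA can be cross-checked from the F ratio and its degrees of freedom. The sketch below uses the standard identity for partial eta squared; the function name is illustrative, and the check is offered as a reader's consistency test rather than as part of the original analysis.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F ratio: (F * df1) / (F * df1 + df2)."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and degrees of freedom positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values reported in the text for the four pilot rating groups: F(3, 89) = 9.05.
effect_size = partial_eta_squared(9.05, 3, 89)
```

With the reported values the identity gives approximately .23, in agreement with the figure quoted above.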

To explore the impact of cognitive factors on incursion management, scores 
from the DCAT were examined as possible predictors of incursion management. 
As shown in Table 12.3, none of the DCAT tests correlated significantly with 
the incursion management score; however, DCAT 6, the test of identification 
of driving situations was marginally significant. To explore in more detail the 
effects of age on incursion management, interaction terms were calculated 
for age and each of the three cognitive measures. Only the age x DCAT 
6 interaction was significant, t= 2.38, p =.020. The interaction revealed that 
incursion management by younger pilots was not adversely affected by lower 
than age-expected DCAT 6 scores. This was not the case for the older pilots, 
whereby lower than age-expected scores on the DCAT 6 tended to result in 
poorer incursion management. 

To examine possible reasons for the lower performance of lower rated pilots, 
secondary correlation analyses were obtained for pilot rating and the three 
cognitive subtests. The span of attentional field subtest (DCAT 2) showed a 
significant positive relationship with pilot rating (r = .26, p = .014). 

Self-rated workload and situation awareness and runway incursion management 
Tables 12.4 and 12.5 show the correlation coefficients between items on the 
self-rating scales and incursion management. The situation awareness self-rating item 
pertaining to awareness of the impact of other aircraft on flying the “perfect” 
circuit (pattern) was significant (p < .001). Table 12.4 also includes the correlation 
for the corresponding item from the low-difficulty condition (the same scale was 
completed at the end of each difficulty condition). This correlation was tested 
to ensure that the self-awareness rating did not just reflect the outcome of the 
incursion scenario. The correlation between incursion management and self-rated 
awareness of the impact of other aircraft on flying the “perfect” circuit (obtained 
just after the low-difficulty condition where an incursion scenario did not unfold) 
did not reach significance, but did show a positive trend in the expected direction 
(r = .16, p = .13). Two additional self-rated features of situation awareness showed a 
similar association with incursion management scores: awareness of task priorities 
and position of ownship relative to other aircraft. The scale items representing task 
priorities and relative ownship did not contribute significantly to the model (p=.08, 
.09, respectively) and were not included in subsequent modeling. The NASA- 
TLX item pertaining to mental demands correlated with incursion performance 
whereby higher subjective estimation of workload was positively associated with 
higher (better) incursion management scores (p=.05). 


Table 12.3 Correlations between incursion management and individual pilot characteristics and neurocognitive 
test scores 

Measure                                          r             p value 
Age x DCAT 6                                     .20           .053 
Identification of Driving Situations (DCAT 6)    .02           .865 
Span of Attentional Field (DCAT 2)               .10           .335 
Motor Speed and Control (DCAT 1)                 .01           .940 
[...] Hours                                      .10           [illegible] 

[Table 12.4, self-rated situation awareness items; title and some cells illegible in the source] 

Item                                             r             p value 
Other [...]                                      .08           [illegible] 
Other Activities                                 .09           .379 
Predict Circuit Events                           [illegible]   [illegible] 
Impact of Other Aircraft on Perfect Circuit 
  (low-difficulty condition)                     .30 (.16)     .003 (.13) 
Own Relative Position                            [illegible]   .088 
Task Priorities                                  [illegible]   .080 
The Perfect Circuit                              [illegible]   .309 
Time to Complete Tasks                           -.01          .953 
Tasks                                            .08           .428 
Navigation                                       -.08          .471 

Table 12.5 Correlations between incursion management and self-rated 
task workload 

Workload Item        Incursion Score r   p value 
Mental Demands       .21                 .048 
Physical Demands     .02                 .859 
Temporal Demands     .07                 .499 
Frustration          -.06                .545 
Effort               .10                 .327 
Performance          .12                 .273 
Overall Workload     -.03                .792 


A model of predictors of incursion management 

Stepwise linear regression analysis was undertaken to determine if the factors that 
correlated significantly with the incursion management scores would contribute to 
a single predictor model. To examine the interaction between pilot rating (all 
four groups) and age as a continuous variable, pilot rating, age (both 
centered), and an age x pilot rating interaction term were entered into block one. The age 
x DCAT 6 interaction terms were entered into the second block. The self-rating 
scores for mental demand and awareness of the impact of other aircraft were 
entered into the third block of the model. The final model results are shown in 
Table 12.6. Pilot age and the age x pilot rating interaction term were not significant 
(t < 1) and were excluded from the final regression model. 
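
The centering and interaction terms described above can be constructed as in the following minimal sketch. It assumes mean-centering of both predictors before multiplying, which is the usual way such product terms are built. The chapter confirms only that age and pilot rating were centered, so treating the age x DCAT 6 term the same way is an assumption, and the function names are illustrative.

```python
from statistics import mean


def center(values):
    """Subtract the sample mean so the centered variable averages zero."""
    m = mean(values)
    return [v - m for v in values]


def interaction_term(first, second):
    """Product of two mean-centered predictors, one value per subject."""
    if len(first) != len(second):
        raise ValueError("predictors must have one value per subject")
    return [a * b for a, b in zip(center(first), center(second))]
```

Centering before forming the product reduces the correlation between the interaction term and its component predictors, which makes the block-one main effects easier to interpret alongside the interaction.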


Table 12.6 Significant predictors of simulated runway incursion 
management scores 

Linear Regression Coefficients 

Predictor                                Unstandardized Coefficients   Standardized Beta   t-value 
                                         (Standard Error)              Coefficients 
Pilot Rating                             1.24 (.36)                    .32                 3.44** 
Age x DCAT 6                             .039                          .02                 2.10* 
SA: Impact of Other Aircraft on Flight   .56 (.23)                     .24                 2.46* 
TLX: Mental Demands of the Task          .35 (.18)                     .19                 1.86* 

* significant at p < .05   ** significant at p = .001 


The regression model revealed that together, pilot rating, self-rated mental 
workload, and situation awareness for the impact of other aircraft on flight 
in the pattern, and the interaction between age and the DCAT 6 accounted for 
approximately 29 percent of the variance in runway incursion management score, 
F(6, 86) = 5.87, p < .001. The direction of the standardized Beta coefficients 
indicated that incursion management scores increased for pilots who correctly 
estimated the high level of mental demands in the high-difficulty condition and 
who rated their own situation awareness for the impact of other aircraft on their 
ability to fly the “perfect” pattern as high. In addition, pilots with higher levels 
of pilot rating tended to have better runway incursion management scores. The 
unstandardized coefficients may be interpreted such that for each step up in rating 
(for example, level one, student to level two, private pilot) there is a corresponding 
1.24 increase in incursion management scores. Similarly, when the situation 
awareness and mental demand items were rated one level higher on the self-rating 
scales there was approximately a 0.6 and 0.4 increase in incursion management 
scores, respectively. The interaction between age and DCAT 6 can be understood 
as follows: older age increased the magnitude of the relationship between the cognitive 
measure and incursion management scores. Specifically, demonstrating higher 
than age-expected scores for the DCAT 6 (identification of driving situations) was 
associated with better incursion management for older, but not younger, pilots. 
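
The interpretation of the unstandardized coefficients can be made concrete with a short worked sketch. It uses only the coefficients reported in Table 12.6 for pilot rating and the two self-rating items. Because no intercept is reported, the sketch computes differences in predicted score rather than predicted scores themselves. The second function is a reader's consistency check relating the reported variance explained to the reported F ratio; it assumes six terms in the model, as implied by F(6, 86) with 93 subjects. All names are illustrative.

```python
# Unstandardized coefficients as reported in Table 12.6.
B_PILOT_RATING = 1.24
B_SA_OTHER_AIRCRAFT = 0.56
B_TLX_MENTAL_DEMAND = 0.35


def predicted_change(d_rating: float = 0, d_sa: float = 0, d_mental: float = 0) -> float:
    """Change in predicted incursion score for given changes in each predictor,
    holding the other predictors constant."""
    return (B_PILOT_RATING * d_rating
            + B_SA_OTHER_AIRCRAFT * d_sa
            + B_TLX_MENTAL_DEMAND * d_mental)


def f_from_r_squared(r_squared: float, n_terms: int, n_subjects: int) -> float:
    """Overall F ratio implied by R squared: (R2 / k) / ((1 - R2) / (n - k - 1))."""
    df_error = n_subjects - n_terms - 1
    if not 0 <= r_squared < 1 or n_terms <= 0 or df_error <= 0:
        raise ValueError("invalid R squared or degrees of freedom")
    return (r_squared / n_terms) / ((1 - r_squared) / df_error)
```

For example, a one-step increase in pilot rating alone corresponds to a 1.24-point difference in predicted score. Applying the second function to an R squared of .29 with six terms and 93 subjects gives an F ratio close to the reported 5.87; the small gap is consistent with the variance explained being quoted as approximately 29 percent.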

Summary 

This discussion has been focused on individual pilot characteristics as predictors 
of incursion management. Both the literature and the findings of this study indicate 
that pilot age, expertise, and cognitive factors are implicated in the capacity of 
pilots to manage a surprise incursion upon landing. The relationship of age to 
performance was not linear and appeared to interact with pilot rating and cognition. 
The significant correlation between incursion management and the interaction 
term for age and DCAT 6 demonstrated that older pilots with lower age-normed 
DCAT 6 scores also tended to have poor incursion management scores. The DCAT 
6 subtest required that the pertinent visual and auditory information from a scene 
be quickly integrated and interpreted so that critical decisions regarding oneself, 
as the hypothetical driver in the scene, could be made. As just described, there 
are clear similarities between the attention, stimuli selection, and perceptual- 
motor mechanisms required to respond correctly on the DCAT 6 and the cognitive 
mechanisms needed for adept handling of the surprise incursion. To the authors’ 
knowledge, the results of this study are the first indication that a test designed to 
predict older at-risk drivers might also have applications in the aviation domain. 
Because the DriveABLE test was designed for and validated with an older 
population the interpretation of scores for younger pilots should be undertaken 
cautiously. 

Pilot rating was clearly an important predictor in incursion management, as 
evidenced by the results of the linear regression. Pilot rating also interacted with 
age so that when age was grouped according to decades an inverted u-shaped 
pattern emerged which suggested that the lowest scores in the low-expertise group 
were found with the oldest and the youngest pilots. In the high-expertise group 
the oldest pilots again exhibited the lowest scores, but the second to youngest 
group demonstrated low scores similar to the oldest group. The youngest high- 
expertise pilots achieved scores similar to the middle-aged high-expertise groups. 
High expertise appears to benefit the youngest pilots in a way not seen in the oldest 
pilots. The oldest highest-rated pilots had lower scores as compared to the middle- 
aged pilots and did not appear to benefit from high expertise to the same degree as 
the youngest pilots (oldest high-expertise pilot mean scores were in the high-risk 
zone). These results are supported by previous findings showing that age-effects 
on pilot performance were typically not moderated by pilot ratings (Taylor et al., 
2007; Yesavage et al., 2011). Replication of these findings with larger older pilot 
groups is needed. 

The positive association between pilot rating, total flight hours, and recent 
flight hours might have some role in the better performance seen by the 
higher-rated pilots. The higher-rated pilots also tended to have higher age-normed scores 
on a cognitive test measuring attentional field of view. The attentional field of 
view test reflects the capacity for attending to relevant and inhibiting irrelevant 
visual stimuli and then recalling the locations of relevant stimuli from short-term 
memory. An attentional field that is capable of handling multiple stimuli at a rate 
above age norms might certainly be an advantage when encountering a surprise 
runway incursion. 

Measures indexing pilot self-rating for situation awareness pertaining to 
the impact of other aircraft on one’s own flight and for the mental demands of 
the simulated flight tasks were also significant predictors of the outcome of the 
surprise incursion upon final approach. The predictive function of subjective 
ratings of situation awareness and task workload on performance managing 
surprise events was considered an important element of the present study because 
of the strong reliance upon pilot self-evaluation in GA competency evaluation 
procedures (Federal Aviation Administration, 2006). Findings suggest that 
accurate representation of mental demands during flight and good self-ratings 
of situation awareness for traffic in the pattern might also predict capacity for 
handling surprise runway incursions. These results support the idea that self- 
awareness can assist pilots in understanding their own possible risk should they 
encounter a surprise runway incursion, and how that risk might change over the 
course of their flying experience. 

In sum, this study suggested that lower-rated younger and older pilots were 
most at risk for a poor outcome when encountering a simulated runway incursion 
upon landing. We also reported the novel finding that one DCAT (DriveABLE, 
1997) subtest, measuring identification of relevant auditory and visual features of 
a scene, was associated with incursion management for older pilots. Future studies 
replicating these findings and examining other cognitive mechanisms which might 
predict incursion management can be used to inform runway safety policies and 
programs. A better understanding regarding which pilots might benefit most from 
runway incursion safety strategies can improve cost-benefit ratios and deliver 
timely information to those perhaps at risk when encountering one of the many 
incursions which occur suddenly and without warning. 
Three-Body Problem 1 

Applications of cognitive research to the design of more effective operational 
and training systems have been informed by two alternative perspectives on 
semiotics—the dyadic and the triadic perspectives. We will make the case 
that the triadic perspective provides a more comprehensive framework for 
cognitive science, which is especially valuable when considering the pragmatic 
implications for designing socio-technical systems (for example, designing virtual 
environments for training). However, the introduction of a “third body” into the 
cognitive system raises important challenges for both science and application. In 
this chapter, we will consider some of the challenges of the triadic (three-body) 
system and will suggest how synthetic task environments can help researchers 
address these challenges. 


Semiotics 

The theoretical context for cognitive science and for its application to the design 
of socio-technical systems was strongly influenced by the field of semiotics. 
Semiotics is typically described as the science of signs, but it can also be described 
as the science of meaning making. That is, the focal question of semiotics is how 
meaning is attributed to signs or representations. Ferdinand Saussure and Charles 
Sanders Peirce are typically credited with independently founding the field of 
semiotics (Eco, 1979; Morris, 1971). However, they approached the problem from 
two distinct perspectives. 


1 Distribution A: Approved for public release; distribution unlimited. 88ABW 
Cleared 12/18/2013; 88ABW-2013-5380. 
Saussure’s dyadic semiotic system 

Saussure, generally regarded as the father of linguistics, framed the semiotic 
system in terms of the dyadic relation between a sign/symbol and an agent/ 
observer, as illustrated in Figure 13.1. Saussure’s interest was particularly in the 
evolution of alphabets and languages. Thus, he viewed the semiotic problem 
from the perspective of assigning meaning to symbols (for example, written or 
spoken language). This framework fit ideally with the computer metaphor of mind 
and it set the stage for the first wave of cognitive science and the 
information-processing approach to cognition and design. In this context, the cognitive agent 
was considered to be a symbol processor and the focus of basic research was on 
exploring the internal information processing constraints (for example, channel 
capacity and internal recoding). The focus for application of this approach involved 
characterizing the internal information constraints so that these constraints could 
be considered in designing cognitive work (for example, avoid overloading the 
limited capacity working memory). 



Figure 13.1 Saussure’s dyadic model of the semiotic system 

In applying the dyadic approach to socio-technical systems, it was natural 
to focus on the coherence between the surface structure of the interface (that is, 
the symbol or representation) and the responses or interpretations of the human 
operator. Research hypotheses in this paradigm were typically framed in terms 
of the coherence between general surface properties of the interface and the 
information-processing demands. Classical examples include early work on 
shape coding to improve discriminability among different controls (Jenkins, 
1947) and work on stimulus-response compatibility that looked at the coherence 
of the spatial topology of the display representation relative to the spatial 
topology of the response (Fitts & Seeger, 1953). More recently, attention has 
been given to the organization or clustering of information in the display (for 
example, integral versus separable displays), relative to hypothetical information 
processing limitations (for example, parallel versus serial processing) (Wickens & 
Carswell, 1995). In all these instances, hypotheses about the relative effectiveness 
of alternative representations were often tested using generic laboratory tasks 
motivated by assumptions about the relevant information processes. 
Peirce’s triadic semiotic system 

Peirce, the father of Pragmatism, was interested in the pragmatics of belief and 
action in the world. How is it that our beliefs about the world can become the basis 
for successful action in the world? Thus, Peirce brought a third component into 
the semiotic system. In essence, the third component reflects a source behind the 
sign or representation—that is, a problem domain or a natural ecology. By adding 
this third component, Peirce brought two additional relations into the semiotic 
system. In addition to the coherence between the sign and the expectations of the 
agent considered in the dyadic system, the triadic system involves the structural 
mapping between the sign and the source domain and the correspondence between 
the agent’s beliefs about action and the actual consequences of action in that source 
domain as illustrated in Figure 13.2. 



Figure 13.2 Peirce’s triadic model of semiotics introduces a third “body” 
into the system 

In the triadic model, the semiotic problem changes from interpreting a 
symbol to adapting to the demands of a problem domain. Rather than the symbol 
or representation being the “stimulus,” it becomes simply a medium, with the 
stimulus displaced to the problem domain. The ultimate test of the triadic system 
is not whether the representations match the agent’s expectations and beliefs, but 
rather whether the agent’s expectations and beliefs support successful interactions 
with the problem domain. Attention shifts from the syntax of the surface features 
of the interface representation to the semantics associated with the deep structure 
of the problem domain. Additionally, the pragmatic design goal is to shape the 
agent’s expectations through training and/or interface design in ways that lead to 
productive interactions with the problem domain. 

Note that in the triadic semiotic system, the USER-centered concerns 
associated with the coherence between the interface and agent expectations remain 
an important component of the semiotic system. However, the triadic model also 
raises additional USE-centered concerns about the relations between structure in 
the representation and the functional constraints associated with the target problem 
domain (Flach & Dominguez, 1995). In the context of the triadic model, the design 
challenge shifts from “matching” the agent’s mental model, to “shaping” the 
agent’s mental model so that it supports productive action with regard to a target 
problem domain. 


The Three-Body Problem 

As physicists know, modeling the motion of interacting bodies in space becomes 
significantly less tractable when a third body is introduced. Avoiding this difficulty is one of the 
major attractions of the dyadic approach to semiotics. Within the dyadic framework, 
the image guiding research was that of a communication channel, and problems of 
cognition were reduced to open-loop, symbol-processing problems, constrained 
only by internal information-processing limits as illustrated in Figure 13.3A. In 
this context, research questions became significantly more tractable in terms of 
identifying simple causal relations between stimuli and responses. This allowed the 
use of simple laboratory paradigms motivated by information processing models 
for independent stages of processing. The laboratory tasks typically required no 
special knowledge so that general populations of readily accessible participants 
could be studied. Thus, large-N studies were feasible and it was possible to use 
strong statistical inference to judge effects. 

In contrast to the communication channel metaphor, the triadic model of 
semiotics suggests a dynamic closed-loop coupling between perception and action 
as illustrated in Figure 13.3B. This reflects an abductive logic where the “tests” 
of beliefs are the practical consequences from acting on those beliefs. In this 
dynamic, the “sign” interface has a dual function in terms of action/control (that 
is, comparing consequences with intentions, the difference being error) and 
perception/observation (that is, comparing consequences with expectations, the 
difference being surprise). This leads to a self-organizing dynamic where 
the cognitive agent is simultaneously shaping actions and being shaped by the 
ecological consequences of those actions. To understand the dynamics of the triadic 
system it becomes necessary to understand the constraints associated with the work 
domain or problem space (that is, deep structure) and the potential interactions of 
these constraints with the internal constraints (that is, mental models) of the human 
agents in relation to observation and control. In the following sub-sections some 
implications for approaching the triadic semiotic system are considered. 

Figure 13.3 Introducing a third body changes the underlying dynamics 
from open-loop (that is, causal) in (A) to closed-loop (that is, 
self-organizing) in (B) 

Cognitive task versus work domain analysis (WDA) 

As reflected in the images of the triadic semiotic system, a necessary step in a triadic 
approach is to bring the work ecology into the research frame. Thus, a prerequisite 
is to identify the deep structure of that ecology. This is the goal of Work Domain 
Analysis (WDA) (Vicente, 1999; Naikar, 2013). To set the context for this, it is 
important to distinguish WDA from Cognitive Task Analysis (CTA) (for example, 
Fleishman & Quaintance, 1984). CTA has typically been designed to reflect the 
information-processing activities associated with the work. This makes perfect 
sense from the dyadic perspective where the focus was on cognitive activities 
inside the head of the human agent. For the CTA perspective, explanations are 
based on a causal logic, where the time history of activities is traced backward in 
order to discover the root cause or prime mover. 

In contrast, the focus of WDA is on the functional constraints associated with 
the problem domain. For example, in aviation this includes the aerodynamic 
constraints on vehicle motion, situational factors within the airspace (for example, 
weather), the regulatory constraints on airspaces, as well as the value constraints 
against which safety and efficiency are measured. The goal is to better understand 
the “deep structure” of the problem. In the WDA perspective, explanations are 
based on a field logic, where behaviors emerge as the result of interacting fields of 
constraints. Kirlik (1995) provides an excellent pedagogical example of how this 
field logic works to explain adaptive behavior. 

An important aspect of WDA is the realization that the constraints that shape 
behavior in work domains arise at multiple functional levels of abstraction (for 
example, Rasmussen, 1986) and at multiple organizational levels (for example, 
Leveson, 2011). Rasmussen’s Abstraction Hierarchy (AH) provides one formalism 
for thinking about different functional levels within a means-ends description of 
work constraints. Leveson’s Systems-Theoretic Accident Model and Processes 
(STAMP) model also provides a guide for tracing the control loops through multiple 
levels within organizations. The clear implication of Rasmussen’s and Leveson’s 
approaches to work analysis is that the focus needs to expand to consider the larger 
socio-political-organizational context within which work is done. 

Representative design of experiments 

Research motivated by the dyadic approach is often designed to isolate variables 
associated with specific internal information-processing stages. Thus, the choices 
of tasks and independent variables are typically motivated by models of the internal 
stages. Even when the research is conducted within high fidelity simulations (for 
example, a flight simulator), the research will often utilize secondary tasks (for 
example, memory search or probe reaction time) in order to tap into the relevant 
internal mechanisms. 

In a triadic approach, however, the focus is on how performance is shaped 
by the deep structure of the problem domain. Thus, the tasks and independent 
variables are explicitly chosen to reflect that deep structure. This requires that the 
evaluation context be representative of the work domain (for example see Kirlik, 
2006 for a collection of papers exploring the implications of Brunswik’s call for 
representative design of experiments in relation to research on human-technology 
interaction). It is important to appreciate that representativeness does not simply 
mean that the interface (for example, knobs and dials) functions properly (for 
example, as in a high fidelity flight simulation). It also involves the validity of 
the problem that is driving the interface (that is, the dynamics of the problem 
context to which the research is intended to generalize). Thus, the triadic 
approach requires that the experimental situations or contexts be representative of 
the target domain. For example, in evaluating a design of new technologies for the 
next generation of air space management systems, it would be important that the 
evaluation contexts involve conditions that would be representative of future flight 
ecologies (for example, in terms of air traffic densities, regulatory constraints, 
information technologies, and aircraft capabilities). 

In addition to care in selecting the experimental task scenarios, it also becomes 
important to select participants from representative populations. For example, one 
cannot simply select a participant from an Introductory Psychology course and 
expect him or her to be able to fly a simulated aircraft under realistic air traffic 
conditions. Thus, the triadic approach demands care in selecting participants who 
have the appropriate skills and experience to address the problems presented. This 
raises the issue of competency. 

Mission Essential Competencies 

The construct of Mission Essential Competencies (MECs)™ has emerged in 
the context of training applications and research (Alliger, Beard, Bennett, Jr. 
& Colegrove, 2012; Alliger, Beard, Bennett, Jr., Colegrove & Garrity, 2007). 
In contrasting the dyadic and triadic approaches, the key distinction reflected 
in this construct is a shift from focusing on generic information constraints to 
focusing on “mission relevant” abilities, skills, experience, and knowledge. Thus, 
the construct of competencies focuses on the deep structure of work in terms of 
demands for success in a specific work domain. A key distinction of this approach 
is its emphasis on the specification and rationale for a set of “learning experiences” 
that are central to the development of the abilities, skills, and knowledge needed 
for job/mission success. For example, with respect to air combat, Colegrove and 
Alliger (2002) define a MEC as “higher-order individual, team, and inter-team 
competency that a fully prepared pilot, crew, flight operator, or team requires for 
successful mission completion under adverse conditions and in a non-permissive 
environment.” In this context the coupling of the development experiences with 
the needed understanding is a critical contribution to the discussion in this section. 
In essence, consistent with the triadic approach, the MEC™ construct situates 
or grounds the properties of the cognitive agent (that is, awareness) relative to 
specific demands of a work domain (that is, situations or experiences) and thus 
provides a triadic basis for making decisions for designing training scenarios and 
goals as well as defining criteria that can be used to assess achievement of learning 
and goals. 

Ecological interfaces 

Training reflects one path for shaping the internal models of operators so that they 
better correspond with the deep structure of specific problem domains leading 
to more productive actions. Another means for shaping the internal models of 
operators is through the design of interface representations (Bennett & Flach, 
2011; Rasmussen & Vicente, 1989). The construct of Ecological Interface Design 
(EID) provides a triadic alternative to the conventional dyadic approach that 
tends to emphasize matching generic internal models (for example, population 
stereotypes), rather than shaping internal models so that they better correspond 
with the demands of specific work domains. The emphasis of the EID approach 
is on designing display constraints (for example, configural visual graphics) that 
are explicitly mapped to the underlying deep structures of the work domain. In 
this context, the emphasis shifts from focusing on capacity limitations to focusing 
on skills such as chunking that allow experts to by-pass these limitations in order 
to meet the demands of complex tasks (for example, Chase & Simon, 1973; 
Ericsson & Charness, 1994). For example, research on chess suggests that the 
ability of chess experts to remember board positions and to quickly focus on 
good alternative moves reflects a different way of chunking information. Novices 
focus on individual “pieces” and experts focus on the spaces that the pieces are 
attacking (Reynolds, 1982). Thus, structure in configural graphics is designed 
to bias operators toward organizing (that is, chunking) information in ways that 
support productive thinking or expertise. 

Cross-disciplinary collaboration 

Finally, a major challenge of the three-body problem is that no single perspective 
on the system is privileged. That is, the system no longer fits within a single 
disciplinary perspective. The terms currently being used to reflect the scope of the 
problem are systems of systems, federations of systems, and socio-technical systems 
(for example, Sage & Cuppan, 2001). The problem is that there is no single 
socio-technical scientific discipline. Addressing the three-body problem requires real 
collaborations that include multiple types of expertise. These include, but are not 
limited to, domain experts (for example, what should be done?), technical experts 
(for example, what’s technically possible?), psychological and physiological 
experts (for example, what’s humanly possible?), social experts (for example, 
what is collectively possible?), cultural experts (for example, what’s socially 
acceptable?), and so on. The difficulty of managing such multi-disciplinary 
research teams is monumental due to conflicting languages and value systems. 
Success of such teams can depend on joint commitment to solving a pressing 
functional problem (for example, the cybernetic and atomic research programs 
during the Second World War) and a common organizing framework (for example, 
Cognitive Systems Engineering) for integrating the various perspectives. 


Synthetic Task Environments 

The previous section illustrated some of the ways that the addition of the 
third “body” to the semiotic system changes the questions that become most 
interesting for researchers. A clear implication of this shift for research is that 
it becomes necessary to incorporate the deep structure of specific work domains 
into the experimental contexts. Fortunately, information technologies such as 
high fidelity simulators and virtual environments provide one means to do this. 
These technologies allow researchers to build or utilize synthetic environments 
that represent the deep structures of specific work domains with a relatively high 
level of fidelity. While bringing more of the richness of natural work domains into 
the laboratory, these synthetic environments offer possibilities for manipulation 
(for example, exploring low-probability events) and replication of conditions 
(for example, repeating fixed initial conditions and replaying identical scenarios 
with multiple participants) that would not be possible in naturalistic settings. 
Additionally, these environments typically allow unobtrusive measurement of 
both the situation (that is, independent variables) and operator performance (that 
is, dependent variables) in ways that often are not possible in natural settings. 
Schreiber, Schroeder, and Bennett, Jr. (2011) describe the difficulties in this area 
directly in their discussion of within-simulator training effectiveness in distributed 
synthetic environment contexts. 

The measurement problem 

The ability to simultaneously measure properties of the changing situation 
and the performance of operators at multiple levels of abstraction is both an 
opportunity and a challenge for researchers using synthetic task environments. 
On the opportunity side, one of the biggest challenges for conventional research 
focused on generic information-processing tasks was to relate statistically 
significant differences observed in laboratory tasks to practical differences in 
specific work domains. Would a significant laboratory effect on reaction time 
translate to a practical difference in operational effectiveness? Synthetic task 
environments provide a means to address this question empirically. That is, within 
a synthetic task environment it is possible to simultaneously measure micro-level 
performance differences (for example, reaction time to a specific display event) 
and more macro-level functional differences (for example, winning or losing an 
engagement) (see Schreiber et al., 2009). 

Comparisons across levels of abstraction provide empirical evidence about 
whether differences at the micro-level (for example, in terms of specific actions 
or specific design alternatives) are correlated with success at the macro-level (for 
example, in terms of success at the operational level). For example, reaction times 
of pilots to alternative traffic warning displays can be related to the actual numbers 
of collisions that occur as a function of the display types. Thus, the question about 
the operational effectiveness of a significant difference in reaction time can be 
empirically addressed. By measuring performance simultaneously at multiple 
levels, synthetic task environments can allow patterns at one level to be empirically 
evaluated relative to patterns at other levels. Such measurement opportunities can 
provide a bridge between practice and theory that will lead to improvement on both 
ends. This bridge is particularly important for complex, nonlinear systems where 
analytical linear extrapolations fail, and insight typically depends on empirically 
linking quantitative changes at the micro-level with qualitative changes at the 
macro-level (for example, Shaw, 1984). 
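
The cross-level comparison described above can be sketched in a few lines of analysis code. The example below is a minimal illustration only: the trial log, the display names, and every number in it are invented for this sketch and are not data from any study. It summarizes a micro-level measure (reaction time) and a macro-level outcome (whether a collision occurred) for each display type, and then computes a simple Pearson correlation between the two across trials.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical trial log: (display_type, reaction_time_s, collision_occurred).
# Values are invented for illustration only; they are not empirical results.
TRIALS = [
    ("display_A", 1.2, False),
    ("display_A", 1.5, False),
    ("display_A", 2.4, True),
    ("display_A", 1.1, False),
    ("display_B", 2.0, False),
    ("display_B", 2.6, True),
    ("display_B", 2.9, True),
    ("display_B", 1.8, False),
]


def summarize_by_display(trials):
    """Return {display: (mean reaction time, collision rate)}."""
    grouped = defaultdict(list)
    for display, rt, collided in trials:
        grouped[display].append((rt, collided))
    summary = {}
    for display, rows in grouped.items():
        mean_rt = sum(rt for rt, _ in rows) / len(rows)
        collision_rate = sum(1 for _, collided in rows if collided) / len(rows)
        summary[display] = (mean_rt, collision_rate)
    return summary


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Micro level (reaction time) related to macro level (collision outcome).
summary = summarize_by_display(TRIALS)
reaction_times = [rt for _, rt, _ in TRIALS]
outcomes = [1.0 if collided else 0.0 for _, _, collided in TRIALS]
micro_macro_r = pearson(reaction_times, outcomes)
```

A real analysis would need far more trials and an outcome model suited to binary data; the sketch only shows that both levels can be read from the same log and compared directly.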

A significant challenge for research using synthetic task environments is 
data overload. The opportunity to measure everything can make it harder to see 
anything. Based on our experiences, we venture the guess that many research 
programs using synthetic task environments have oodles of data that get archived, 
but that are never analyzed or examined in a systematic and comprehensive 
way. In order to take advantage of the data that synthetic environments make 
available to researchers, it may be essential that the search of that data is guided 
by theories about the deep structure of the work domain, about the domain-specific 
competencies required, and about the generic constraints on awareness. Thus, the 
solution depends on clever partitioning of the problem and the use of converging 
operations to discover and isolate signals (for example, patterns associated with 
fundamental properties) that are embedded in the complexity. The search for 
patterns in the data generated by synthetic task environments must be guided by 
basic theories of complex systems in general, and cognitive systems in particular. 
Consequently, basic theory will be critical for gaining insights into the possible 
empirical relations that might be functionally relevant. 


Conclusion 

The fundamental point that we hope to leave the reader with is that the shift from 
a dyadic semiotic perspective to a triadic perspective is not a simple matter of 
adding two plus one. The shift in perspective requires fundamental changes in the 
logic of explanation—from billiard ball causal models to a logic based on fields 
of interacting constraints (for example, see Dekker, 2011). It requires a shift from 
activity-based models of work (that is, task analysis) to constraint-based models of 
work (that is, WDA). It requires increased attention to the external validity (that is, 
representativeness) of our theoretical constructs and empirical research paradigms. 
It requires changing our perspective on design from a user-centered perspective 
based on matching existing operator expectations (for example, internal models), 
to a use-centered perspective with the goal to shape user expectations so that 
they are more congruent with pragmatic demands of specific domains and the 
opportunities afforded by advanced technologies. 

We see synthetic task environments as an opportunity for meeting some of the 
challenges raised by the triadic perspective. However, it is important to understand 
that we are not using the term synthetic task environments to refer to a specific 
type of technology (for example, virtual reality). Rather, the construct of synthetic 
task environments reflects a commitment to do research that better represents the 
functional demands of operational worlds. It reflects a joint commitment to the 
values of basic and applied science as reflected in Pasteur’s Quadrant (Stokes, 
1997). That is, the commitment is to a rigorous scientific approach that has clear 
practical implications for improving the performance of socio-technical systems 
such as future airspace systems. While any observation, whether naturalistic 
or experimental, is necessarily reductionistic, the fundamental challenge for 
synthetic task environments is to partition the problem in ways that preserve the 
integrity of the constraints that shape performance. The challenge is to achieve 
levels of control to allow defensible conclusions without trivializing the complex 
functional dynamics of the target work environment. 
Emergence of a Need: Reinforce Pilots’ Ability to Handle Complex and 
Unforeseen Situations 

Manage complex and unforeseen aeronautical situations 

During the last decade, military crews have been faced with a number of changes 
due to the ever-increasing complexity and diversity of modern military operations 
and systems. In this context, a need to reinforce their skills in dealing with complex 
and unforeseen situations has arisen. Pilots’ expertise can no longer be limited to 
the mastery of knowledge and skills allowing them to handle expected situations, 
whether these are normal or accidental. Complex and unforeseen situations 
cannot be processed solely on the basis of fast associations and easily applicable 
procedures. Rather, they require an adaptive, even a creative, use of knowledge 
and skills. If adaptation is lacking, pilots are at risk of responding ineffectively, of 
showing acute stress reactions, and as a result, of endangering safety. 

Reports from the French Defense Air Accident Investigation Board (for 
example, BEAD-Air, 2004, 2006) have indicated difficulties for pilots in 
dealing successfully with complex and unforeseen situations. Recognizing and 
understanding the stakes of unusual and complex situations may be challenging. 
Pilots may also have difficulties making decisions which take into account all of 
the constraints associated with the situation. On the whole, these reports point to 
a difficulty recognizing abnormal situations and approaching these situations with 
an adequate state of mind and adequate cognitive tools. 

Casner, Geven, and Williams (2013) presented airline pilots with abnormal 
events under: (1) the familiar circumstances used during airline training; or (2) 
unexpected circumstances, as might occur during a flight. The results showed that, 
for approximately one-third of the pilots, performance was severely hampered 
when the event occurred in unusual circumstances. Bourgy (2012) found a similar 
proportion of failures in a recent simulator study in which fighter pilots faced an 
unforeseen situation: one-third of the pilots did not manage to make innovative sense of 
the dysfunctions that they encountered, which led them to eject in a rushed and 
dangerous manner. Only two-thirds of the pilots avoided such an unsatisfactory 
ending owing to their use of adaptive solutions. 

Therefore, it seems especially important to supplement training programs for 
pilots with specific knowledge and tools targeting adaptation to, and management 
of, complex and unforeseen situations. 

Adaptability, a transversal skill 

Reaching high levels of performance and safety in all situations is of fundamental 
importance in risky work environments such as aeronautics, the nuclear industry, 
or medicine. The issue of how to maintain adequate performance when faced with 
unusual situations is thus currently a focus of attention in these organizations. 
Operators must not only be able to effectively manage expected situations; they must 
also be able to handle unforeseen situations. The High Reliability Organizations 
(Weick & Sutcliffe, 2001) and the Resilience Engineering (Hollnagel, Woods 
& Leveson, 2006) movements seek to define the organizational conditions that 
make such adaptability possible. Studies have revealed the necessity to couple 
anticipation dynamics (by preparing individuals for the likely situations) with 
adaptation dynamics (by preparing individuals to respond to the special features of 
the situations actually encountered) (Bigley & Roberts, 2001; Hollnagel & Woods, 
2006; Weick, Sutcliffe & Obstfeld, 1999). 

Adaptability of an organization obviously relies on its managerial structures, 
but it also depends to a large extent on the expertise and the autonomy of its actors. 
The articulation between these two analysis levels (organizational and individual/ 
collective) is the object of greater attention today than it was before. For example, 
the French Air and Space Academy recently organized a colloquium entitled "Air 
transport pilots facing the unexpected" (AAE, 2013), which aimed to review 
ways to improve the management of complex and unexpected situations 
at the organizational, collective, and individual levels. 

In this context, "adaptability" tends to emerge today as a transversal 
professional competence. The term denotes "an individual’s ability, skill, 
disposition, willingness, and/or motivation, to change or fit different task, social, 
and environmental features" (Ployhart & Bliese, 2006, p. 13). Adaptability is 
seen as a composite factor that depends on personality traits, cognitive skills, and 
domain-specific knowledge (White et al., 2005). It would allow individuals to 
respond more effectively in emergency or crisis situations, or in situations which 
(1) are uncertain, unpredictable or stressful; (2) require a creative response or 
learning of new tasks; or (3) involve cultural or interpersonal adaptation (Pulakos, 
Arad, Donovan & Plamondon, 2000). 


Enhancing Management of Complex and Unforeseen Situations Among Pilots 231 


Strengthen cognitive adaptation skills: toward a new paradigm 

As far as cognitive skills are concerned, adaptability requires “cognitive adaptation,” 
that is, an ability to creatively activate and coordinate acquired knowledge and 
skills, so as to propose relevant responses and to maintain emotional balance 
when facing changes or challenges. Achieving cognitive adaptation involves 
gathering situational cues, noticing patterns, activating relevant knowledge and 
heuristics, adapting strategies, and learning from the results of action (Ployhart 
& Bliese, 2006; Schunn & Reder, 1998). However, more specifically, cognitive 
adaptation needs other skills, namely knowing how: (1) to grasp relevant aspects 
of a situation even when those aspects are not salient; (2) to mobilize knowledge 
and skills in a creative and adapted manner; and (3) to remain ready to change one’s 
mind. Operators’ adaptive abilities are considered differently in different fields. 
Organizational sciences simply promote them, whereas occupational psychology 
tries to measure and predict them. Applied cognitive psychology and ergonomics 
seek to characterize the mechanisms involved in mobilizing these abilities and to 
maintain/reinforce them in order to better fulfill task requirements. 

The design of training programs for the management of complex and 
unforeseen situations depends critically on the underlying model, which describes 
the mechanisms and conditions of cognitive adaptation. Currently, much 
training is based on a model of cognitive adaptation that gives a central place to 
cognitive control, that is, to conscious and deliberate regulation processes used by 
individuals to check the validity of their representations and cognitive processes. 
We underscore the limitations of this model in the next section. Then, we consider 
new types of training that offer a complementary approach for modeling cognitive 
adaptation. This approach underlines the importance of the state of mind in 
which cognitive processes are deployed. Emphasis is placed on the central role
of open-mindedness and acceptance in cognitive adaptation. Lastly, we discuss
some implications of this new approach to cognitive-adaptation training for the
improvement of pilots’ ability to manage complex and unforeseen situations.


Trainings Based on Cognitive Control Enhancement 

Improve cognitive adaptation by reinforcing cognitive control 

Cognitive and clinical psychology studies assert that cognitive and emotional 
adaptation are based on cognitive control (Hoc & Amalberti, 2007; Ochsner & 
Gross, 2005). Cognitive control is thought of as a metacognitive function, devoted 
to the supervision of the relevance of actions, representations, or thought processes, 
and, if necessary, to their modification. Cognitive control requires: 

• executive processes, which allow individuals to maintain or shift the focus
of their attention, to maintain or modify their goals, and to inhibit automatic
responses (Miyake et al., 2000; Suchy, 2009); executive processes may
explain inter-individual differences in decision-making performance and
emotion regulation (for example, Del Missier, Mantyla & Bruin, 2012;
P. G. Williams, Suchy & Rau, 2009, respectively);

• metacognitive knowledge and skills; 

• reflexivity, that is, the ability to detach from one’s own activity in order to 
observe, evaluate, and correct it. 

Numerous training programs seek to improve decision making or stress 
management by enhancing these three cognitive control abilities. 

Training programs that seek to improve decision making 

At least five categories of decision-making training are based on the principle 
of cognitive control enhancement. Training in the first category is based on the 
acquisition of general decision-making methods. These methods function as 
“check-lists,” which remind operators about key-points in the decision process, 
and prompt them to check that they have not forgotten one step in the decision 
process (see for instance Aeronautical Decision-Making Training, Li & Harris, 
2008). Training in the second category aims at helping operators to make decisions 
in specific situations. It does not provide a complete and systematic method, but 
rather, a set of general heuristics or rules (see for instance Heuristic Rule Training, 
Sauer, Burkolter, Kluge, Ritzmann & Schiller, 2008). Training in the third 
category provides operators with a formalized questioning scheme for looking at 
the relevance of their cognitive processes and representations in a situation (see 
for instance Critical Thinking Instruction, Helsdingen, van den Bosch, van Gog 
& van Merrienboer, 2010). The fourth category of training does not aim to teach 
pre-established content or questions to operators. Rather, the basic principle of 
this training is to familiarize operators with reflexivity in order to lead them to
think critically about their practices (see for instance Decision-Making Training,
Chauvin, Clostermann & Hoc, 2009). Lastly, a fifth category seeks to enhance 
executive functioning, which is the necessary support for cognitive control (see 
for instance attention-management training such as Emphasis Shift Training, 
Burkolter, Kluge, Sauer & Ritzmann, 2010). 

Training programs that seek to reduce stress 

The intentional emotion-regulation strategy referred to as “cognitive change” 
(Gross, 2002) involves changing how we think about a situation, and inhibiting 
inadequate automatic thoughts (Clark & Beck, 2010). It is widely used in cognitive- 
behavioral therapies (CBT). Numerous studies have demonstrated the efficacy of 
CBT and associated cognitive-change techniques on stress and negative-affect 
management, both in patients (for review, Butler, Chapman, Forman & Beck,
2006) and in healthy individuals (for example, Herwig et al., 2007), and also in
occupational settings (for review, Richardson & Rothstein, 2008).

In the military, CBT is commonly used to treat post-traumatic stress disorders 
but only a few studies have reported on the introduction of CBT training to prevent 
stress. R. A. Williams et al. (2004) showed that nine 45-min sessions of CBT 
could improve the psychosocial comfort (measured in terms of sense of belonging, 
loneliness, problem-solving coping, and attachment) of Navy recruits who were at 
risk for depression (ARD), but that it did not actually reduce depressive symptoms 
and perceived stress. A second study (R. A. Williams et al., 2007) involving Navy
recruits (ARD and not ARD) showed similar results for psychosocial comfort.
Moreover, during a high-stress training period, more recruits from the CBT group
than from the control group successfully completed the training. Cohn and
Pakenham (2008) tested the efficacy of a “lighter” CBT intervention (involving 
only two 40-min sessions) on Army recruits. They found positive benefits 
on psychological adjustment, with an increase in positive states of mind and a 
decrease in distress. 

Reinforce control: a costly strategy with limitations 

By improving operators’ ability to control the relevance of their thought processes and
contents, the decision-making and stress-management training described above helps
them to manage complex, dynamic, and risky situations. However, the tools that
are proposed have several limitations. First, the different heuristics or questioning
schemes are not usually generalizable to all situations. This raises two issues:

• How can one ensure that operators are able to choose the relevant heuristic 
or scheme for the situation? 

• How should operators manage situations that are not taken into account 
in schemes? 

In low-complexity, familiar situations, or in training situations where the 
instructor clearly indicates what the relevant aspects and priorities are (Burkolter 
et al., 2010), individuals know a priori what is relevant for them to control. 
Accordingly, they can concentrate on the question “how to control?” and therefore
on cognitive control. However, in complex, real, and unexpected situations, it is
difficult to determine, rapidly and precisely, based on prior knowledge or cues,
where to focus attention and control capacities. “What to control?” then
becomes the key question.

Moreover, cognitive control taxes attentional and executive resources, which 
can lead to cognitive overload as well as increased stress and fatigue, and thus be 
counter-productive (Li & Harris, 2008; Sauer et al., 2008). In addition, because 
cognitive resources are limited, operators will choose what they control, but their 
choices may not be relevant. To overcome this limitation, one may seek to increase 
operators’ executive capabilities (see for instance, Burkolter et al., 2010). However,
some authors have questioned whether these capabilities are amenable to training (for
discussion see, Jaeggi, Buschkuehl, Jonides & Perrig, 2008). Moreover, there
might be ceiling effects in the population of pilots. Therefore, as discussed in the 
next section, in complex and unforeseen situations, cognitive adaptation seems to 
require more than just cognitive control. 


New Trends in Cognitive-Adaptation Training 

Recent developments in cognitive and clinical psychology have opened new 
paths for understanding cognitive-adaptation mechanisms and for enriching the 
principles of complex-situation management training, which have traditionally 
been based on cognitive control enhancement. Studies on thinking dispositions 
have demonstrated that the state of mind influences the quality of the representations 
and processes used. These studies open new avenues for training. We outline one 
example of this below. Other studies, which relate to the notion of mindfulness, 
go in the same direction and emphasize the positive role of open-mindedness and
acceptance in adaptation. A brief review of recent mindfulness work in risky
professional environments is provided.

Training to improve cognitive adaptation by integrating thinking dispositions

Thinking dispositions

Complex and unexpected situations are often ill-defined. Individuals must first 
structure the situation and assign some meaning to it in order to determine which 
of its aspects are relevant to process or control. In other words, they must ask 
“what” to control in addition to “how” to control. These two aspects of adaptation 
have been formalized by Stanovich (2011), who distinguishes between: 

• the algorithmic mind, that is, the executive processes which allow effective
processing of information identified as relevant (the “how?”);

• the reflective mind, that is, the reflective processes which allow the individual
to (re)structure a situation, to assign meaning to it, and to build relevant
frameworks given not only the external characteristics of the situation but
also the individual’s own goals, values, and priorities (the “what?”).

According to Stanovich, processes underlying the reflective mind depend on 
individual characteristics referred to as thinking dispositions. The notion of 
thinking dispositions refers to the way in which an individual interacts with the 
world. It denotes a state of mind, tightly related to different individual cognitive 
propensities, that are well identified in the literature, for example: dogmatism 
and absolutism, actively open-minded thinking and openness, need for cognition, 
flexible thinking, or belief identification (Stanovich, 2011). Thinking dispositions, 
and not just executive capabilities (involved in the algorithmic mind), would allow
individuals to formulate relevant goals and thinking frameworks in complex and
new situations. For example, they predict inter-individual differences in complex
reasoning tasks (for example, Stanovich & West, 2008). These findings suggest
that cognitive adaptation relies on executive capabilities (that determine “the 
processing power of the machine”) and on thinking dispositions (that determine 
the quality of the goals, frameworks, and contents “which are processed by the 
machine”). Therefore, the notion of thinking dispositions seems able to enrich 
the view of cognitive adaptation by providing a first element of response to the 
“what?” question. 

An example of cognitive-adaptation training integrating thinking dispositions 
In order to offer pilots new training to improve adaptability, we tested a training 
approach that seeks specifically to promote thinking dispositions that are 
necessary for cognitive adaptation. This cognitive-adaptation training, Mental 
Mode Management Training (MMMT; Fradin, Aalberse, Gaspar, Lefrançois &
Le Moullec, 2008; Fradin, Lefrançois & El Massioui, 2006), aims at improving
adaptation capabilities in occupational settings. It allows participants to question, 
and possibly, to modify their relationship with complex and stressful situations. 
Participants are invited, firstly, to examine their mental mode, that is, the state 
of mind with which they approach a situation; and secondly, to cultivate a set 
of attitudes favoring the adoption of a mental mode that fosters satisfactory 
performance and emotional balance in complex and unforeseen situations. The 
attitudes being considered and developed are: open-mindedness, acceptance, 
nuanciation, relativization, rationality, and individualization. 

Principles of the Mental Mode Management Training (MMMT) 

The MMMT is based on the Cognitive-Processes Scale (CPS), which is a self- 
report tool. It consists of seven Likert-type scales: six scales are used to assess 
mental mode or state of mind, and one scale is used to assess stress. The first 
six scales correspond to characterizations of the mental mode via six dimensions, 
which correspond to the six selected attitudes. Each scale involves a bi-dimensional 
axis, one end of which corresponds to the preferred attitude, while the other end 
corresponds to the opposite attitude (see Table 14.1). For example, the first scale, 
which is devoted to the “openness” attitude, opposes “routine” to “curiosity.” It 
allows individuals to evaluate whether they are approaching a situation as if it were 
routine (that is, known and mastered), or with a curious state of mind. Thus, CPS 
allows participants to assess the extent to which they were in an automatic mental 
mode (that is, in a thinking disposition adapted to the management of simple or 
well-mastered situations) or in a more adaptive mental mode (that is, in a thinking 
disposition necessary for managing new, complex, or unforeseen situations). 

Each training session consists of five steps:

• choosing a complex or stressful situation, to which it is (or was) difficult
to adapt;

• becoming (more) aware of the predominant mental mode, and of the level
of stress experienced during that situation, using the CPS;

• practicing mental-mode management techniques (see below) to reinforce 
the adaptive mental mode; 

• further evaluating the mental mode and the experienced stress in the chosen 
situation, using the CPS; 

• and finally observing the difference from pre- to post-practice. 

One MMM technique consists of asking oneself questions covering the six dimensions.
The participant is invited to question himself or herself, without trying to provide
immediate answers. A sample question is: “If I put aside the judgment of other
people, and I think about what is really at stake for me, what do I personally think?”
(“individual opinion” dimension). Wishing for something to happen and accepting
the possibility that the opposite outcome might occur is one way to enhance one’s 
cognitive and emotional adaptation skills. This is why another technique aims to 
lead participants to accept the fact that one cannot succeed or possess something 
without running the risk of failing or of losing something (for more information 
about MMMTs, see Fornette et al., 2012; Fradin, 2003; Fradin et al., 2008). 


Table 14.1 Attitudes considered and dimensions of automatic mental
mode and of adaptive mental mode

Attitude             Automatic Mental Mode    Adaptive Mental Mode
Openness             Routine                  Curiosity
Acceptance           Refusal                  Acceptance
Nuanciation          Dichotomy                Nuance
Relativization       Certainty                Relativity
Rationality          Priority to results      Logical reasoning
Individualization    Social image             Individual opinion
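
The structure of the CPS and the pre- to post-practice comparison can be sketched in code. This is only an illustration, not the authors' instrument. The six attitudes and their poles are taken from Table 14.1, whereas the 1-to-9 rating range, the names CPSRating and pre_post_change, and the example ratings are hypothetical assumptions, since the chapter does not state the response format. Higher ratings are assumed to lie toward the adaptive pole.

```python
from dataclasses import dataclass

# Poles of the six mental-mode dimensions, as listed in Table 14.1:
# attitude -> (automatic-mode pole, adaptive-mode pole)
CPS_DIMENSIONS = {
    "Openness": ("Routine", "Curiosity"),
    "Acceptance": ("Refusal", "Acceptance"),
    "Nuanciation": ("Dichotomy", "Nuance"),
    "Relativization": ("Certainty", "Relativity"),
    "Rationality": ("Priority to results", "Logical reasoning"),
    "Individualization": ("Social image", "Individual opinion"),
}

# ASSUMPTION: the rating range is hypothetical; the chapter does not give it.
SCALE_MIN, SCALE_MAX = 1, 9


@dataclass
class CPSRating:
    """One self-report: six mental-mode ratings plus one stress rating."""
    mental_mode: dict  # attitude name -> rating (higher = more adaptive, assumed)
    stress: int

    def __post_init__(self):
        if set(self.mental_mode) != set(CPS_DIMENSIONS):
            raise ValueError("ratings must cover exactly the six CPS attitudes")
        for value in [*self.mental_mode.values(), self.stress]:
            if not SCALE_MIN <= value <= SCALE_MAX:
                raise ValueError(f"rating {value} is outside the assumed range")


def pre_post_change(pre: CPSRating, post: CPSRating) -> dict:
    """Difference (post minus pre) on each mental-mode scale and on stress."""
    change = {a: post.mental_mode[a] - pre.mental_mode[a] for a in CPS_DIMENSIONS}
    change["Stress"] = post.stress - pre.stress
    return change


# Hypothetical ratings for one session (second and fourth steps of a session).
pre = CPSRating({a: 3 for a in CPS_DIMENSIONS}, stress=7)
post = CPSRating({**{a: 5 for a in CPS_DIMENSIONS}, "Openness": 6}, stress=4)
change = pre_post_change(pre, post)
print(change)
```

In this made-up session the participant moves toward the adaptive pole on every dimension and reports less stress, which is the kind of difference the final step of a session asks participants to observe.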


Effects of the Mental Mode Management Training (MMMT) on performance and 
stress management 

A first evaluation of MMMT effects on flight performance and stress management 
was carried out in a sample of French Air Force pilot cadets (Fornette et al., 2012). 
The main methodological features and results of this study may be summarized 
as follows. The class of pilot cadets (N = 21) was divided into two groups: a 
Training Group (TG), which participated in six two-hour training sessions, and a 
Control Group (CG), which did not receive training. Both groups were balanced 
with respect to emotional profiles, initial performance, and instruction-squadron 
membership. Within each group, cadets were further divided into two subgroups 
(High- and Low-performance level) based on the median score of the class. This 
resulted in four subgroups: TG-Low, TG-High, CG-Low, and CG-High. In-flight
performance (in the form of scores ranging from 0 to 20, which were assigned by
flight instructors) was measured, as were mood, anxiety, and stress-management
mode, using questionnaires (POMS (Profile of Mood States), STAI-Y-A (Spielberger
State Anxiety Inventory), and specific questionnaires, respectively).
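
The subgroup assignment described above amounts to a median split on the class scores. The following is a minimal sketch, assuming that a score at or below the class median counts as Low. The chapter does not say how scores equal to the median were handled; with 21 distinct scores this rule would give 11 Low and 10 High cadets, which matches the subgroup sizes reported in Table 14.2. The function name assign_subgroups, the cadet identifiers, and the scores are invented for illustration.

```python
from statistics import median


def assign_subgroups(cadets):
    """Split cadets into four subgroups by group and by class-median score.

    cadets: list of (cadet_id, group, initial_score), with group "TG" or "CG".
    ASSUMPTION: scores at or below the class median are labeled "Low".
    """
    class_median = median(score for _, _, score in cadets)
    subgroups = {"TG-Low": [], "TG-High": [], "CG-Low": [], "CG-High": []}
    for cadet_id, group, score in cadets:
        level = "Low" if score <= class_median else "High"
        subgroups[f"{group}-{level}"].append(cadet_id)
    return class_median, subgroups


# Hypothetical class of five cadets (scores on the 0-20 scale).
cadets = [
    ("c1", "TG", 13.0),
    ("c2", "TG", 14.5),
    ("c3", "CG", 13.5),
    ("c4", "CG", 15.0),
    ("c5", "TG", 12.5),
]
class_median, subgroups = assign_subgroups(cadets)
print(class_median, subgroups)
```

Note that the split uses the median of the whole class rather than of each group, which is why the Low and High subgroups need not be the same size within a group.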

A comparison between in-flight performance before and after training showed 
a significant improvement (p < .05) for the lowest-ranked cadets in the training 
group (TG-Low). For the three other subgroups, no significant change was 
observed. Pre- and post-training flight score means of the four subgroups are 
shown in Table 14.2. The improvement for the TG-Low group persisted until the 
end of the basic flying program (that is, 1.5 months after the end of the training). 
Mood and anxiety scores did not differ significantly between the training and 
control groups. However, the number of cadets who reported having changed their 
mode of stress management during the study was significantly higher (p < .05) for 
the training group (80 percent) than for the control group (27 percent). Moreover, 
70 percent of the training group cadets stated that the cognitive-adaptation training 
had allowed them to better understand events and, consequently, to reduce their 
stress level. 


Table 14.2 Flight score means and standard deviations for the low- and
high-level subgroups of the training and control groups during
the pre- and post-training phases

                                             Pre-training      Post-training
Subgroup                                N    M        SD       M        SD
TG-Low (Training Group, Low Level)      6    13.33    0.67     14.18    0.53
TG-High (Training Group, High Level)    4    14.03    1.01     13.65    0.67
CG-Low (Control Group, Low Level)       5    13.55    0.83     13.43    0.41
CG-High (Control Group, High Level)     6    14.34    0.66     14.33    0.59
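
As a reading aid, the pre- to post-training change in mean flight score can be computed directly from Table 14.2. The sketch below is descriptive arithmetic on the published means only. It does not reproduce the significance tests reported in the text, which would require the individual scores. The variable names are illustrative.

```python
# Values copied from Table 14.2:
# subgroup -> (N, pre-training M, pre-training SD, post-training M, post-training SD)
TABLE_14_2 = {
    "TG-Low": (6, 13.33, 0.67, 14.18, 0.53),
    "TG-High": (4, 14.03, 1.01, 13.65, 0.67),
    "CG-Low": (5, 13.55, 0.83, 13.43, 0.41),
    "CG-High": (6, 14.34, 0.66, 14.33, 0.59),
}

# The subgroup sizes should add up to the class size given in the text (N = 21).
total_n = sum(row[0] for row in TABLE_14_2.values())

# Change in mean score (post minus pre), rounded to the table's precision.
mean_change = {
    name: round(post_m - pre_m, 2)
    for name, (_, pre_m, _, post_m, _) in TABLE_14_2.items()
}

print(total_n)
for name, delta in mean_change.items():
    print(f"{name}: {delta:+.2f}")
```

Only TG-Low shows a positive change in its mean (+0.85 points on the 0-to-20 scale); the other three subgroups change by -0.38, -0.12, and -0.01 points.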


These results suggest that the MMMT may have some beneficial effects: (1) on
flight performance of cadets who have more difficulties during flights than other
cadets; and (2) on stress management of all TG cadets. Although no change in flight
performance was observed in the TG-High trainees, these cadets reported that the
MMMT had helped them to better grasp situations. Thus, one may speculate that
the MMMT contributed by providing new cognitive-adaptation strategies for
TG-Low trainees, and by enhancing the awareness of, and enriching, already existing
strategies for TG-High cadets.

The significant improvement in mood over time in the two groups may have 
been the result of the cadets progressing through the flying program, and realizing 
that they would succeed in completing the training. This is consistent with the
findings of R. A. Williams et al. (2004). There was no significant change in
state-anxiety in the two groups. It might be the case that the STAI-Y-A questionnaire
was not sufficiently specific to capture subtle differences in anxiety levels among 
this population, who fell into the low state-anxiety category. 

This first evaluation suffers from the following shortcomings: (1) the studied 
sample was small; (2) no alternative training group was included; (3) participants 
were not provided with practice in simulated or real situations; and (4) the training
sessions were performed under non-optimal conditions, because they took place at the
end of flying days. Results of the evaluation nevertheless suggested that MMMT 
could be a useful tool for enhancing cognitive adaptation. 

Cognitive-adaptation trainings based on mindfulness 

Mindfulness studies are related to thinking dispositions studies in that both of them 
emphasize the positive role played by open-mindedness and acceptance on adaptation. 
However, mindfulness studies extend thinking dispositions research by formulating 
hypotheses concerning attentional processes underlying this state of mind. 

Mindfulness: a state of mind and a specific attention quality 
The word “mindfulness” refers to a specific attentional practice, as well as to 
the state of consciousness targeted by this practice. Sometimes considered as an 
individual disposition, and sometimes as a generic cognitive aptitude, mindfulness 
is, in any case, a mental ability that can be developed via training. 

The goal of mindfulness practice is to cultivate an open and receptive 
attentional mode in which individuals make themselves available, and connect 
with their experience in its totality, without trying to hang on to it, or to push 
it away, without abstracting or judging it, only by staying present. Stemming 
from oriental meditation tradition, mindfulness was introduced into occidental 
psychology in the last few decades, in particular through programs aiming at 
improving the emotional and clinical states of patients. By learning to focus their 
attention totally on their present experience without being swayed by memories 
or ideas, in a state of open-mindedness and acceptance (Kabat-Zinn, 2003), 
participants see their state improve. Such interventions are effective at reducing 
stress and at improving mental health in both patients and healthy individuals 
(see two recent meta-analyses, Grossman, Niemann, Schmidt & Walach, 2004; 
Hofmann, Sawyer, Witt & Oh, 2010). 

Mindfulness mechanisms 

The positive findings observed in clinical studies may be explained by mindfulness 
effects on attentional and executive functioning. Indeed, mindfulness is a form of 
attentional training. Studies show that mindfulness practice results in greater
effectiveness in orienting attention, an increased capacity to sustain attention,
and improved cognitive flexibility (for example, Jha, Krompinger & Baime,
2007; MacLean et al., 2010; Moore & Malinowski, 2009, respectively).
Furthermore, Wenk-Sormaz (2005) showed that mindfulness training leads to
a reduction in habitual responding on cognitive tasks. Herndon (2008) showed that 
mindfulness improves cognitive performance because it leads individuals to better 
take into account important details of situations. In the context of high-reliability 
organizations, Weick and Sutcliffe (2006) argue that mindfulness is useful for
managing unexpected situations because it encourages individuals to: (1) keep 
in touch with deviating elements; (2) not distort reality to make it conform to 
available concepts; and (3) identify automatic reactions and associations. 

Mindfulness training in risky environments 

Studies have begun to investigate relationships between mindfulness and 
performance, emotional balance, and health at work. For “healthy employees,” 
mindfulness is positively correlated with physical and mental health, as well as 
emotional balance and effective stress management (for review see, Oberdan & 
Passmore, 2010). Moreover, Passmore (2009) reported that mindfulness has been 
shown to have beneficial impacts on several areas involved in work performance, 
such as learning, safety culture, conflict resolution, creativity, and decision making. 

In this context, some studies have recently been devoted to introducing 
and evaluating Mindfulness Training (MT) in military personnel. Jha, Stanley, 
Kiyonaga, Wong, and Gelfand (2010) proposed an MT program for both improving
operational effectiveness and building resilience to stressors in a high-stress 
military pre-deployment context. The evaluation of this training showed beneficial 
effects. In particular, the training was found to increase working memory capacity 
and positive affect, and to decrease negative affect and perceived stress (see also, 
Stanley, Schaldach, Kiyonaga & Jha, 2011). However, these beneficial effects 
were observed only for military participants with high MT practice time. In 
Norway, preliminary results of a first study in a military F-16 fighter squadron 
showed that a 12-month MT may be a viable method to cultivate concentration and 
arousal regulation among individuals who are already scoring high on such skills 
(Meland, Fonne & Pensgaard, 2012). MT may also help to protect individuals 
against future functional and relational impairments associated with high-stress 
contexts. However, MT seems to have negative effects among subjects who are 
not motivated to perform the training. On this basis, a more targeted and shorter 
(three-month) MT was developed, and new studies are ongoing to evaluate its 
effects on cognitive function and stress among military pilots. 


Contribution of these New Perspectives on Training 

Toward a new paradigm for cognitive-adaptation training 

MMMT and MT rely on similar principles. The first is to become aware of,
and to observe non-judgmentally, our attitudes toward or relationships with the
environment (external as well as internal). The second consists of cultivating attitudes that
are conducive to adaptive management of complex and unexpected situations.
Among these attitudes, openness toward experience occupies a central place.

MMMT and MT seem to have beneficial impacts on the cognitive and 
emotional dimensions of cognitive adaptation in professional environments, and 
in particular military activities. Moreover, they respond to some of the limitations
of traditional training. Indeed, traditional training programs are centered on
cognitive control alone, and they assume that individuals can have as many heuristics
or questioning schemes at their disposal as there are situations that they may face,
can choose the right scheme, and can engage attentional resources in a sustained
fashion. In contrast, the goal of new training approaches such as MMMT or MT is
not to reinforce cognitive control by providing specific schemes but rather to focus
on the development of attitudes, which allow individuals to identify a relevant 
framework for the analysis of the situation, and thus to orient their attention toward, 
and to control, useful information (Stanovich, 2011). This training approach offers 
an answer to the question of “what” to control. 

Admittedly, the “traditional” and “new” approaches have the same goal: 
to help operators to take into account the “right” information for managing the 
situation. However, these approaches guide operators in two very different ways. 
The first approach aims to ensure that the information selected by the operator is 
correct (for example, by asking questions such as “what information did you take 
into account when making the decision?”), while the second aims to ensure that 
operators approach situations with a curious and open-minded attitude. 

MMMT allows individuals to acquire a global strategy, which is generalizable 
and transferable, in order to approach and manage situations. In parallel, the 
non-judgmental attitude fostered in MT incites operators to accept all facets 
of a situation, regardless of their a priori nature and emotional valence. This 
specific position is probably the most decisive element. It allows the individual’s 
attention not to be captured by prejudices or routines, and to be allocated to the 
processing of truly relevant information, even if this information is unusual or 
embarrassing. 

This specific state of mind seems to improve the probability that individuals 
will be able to re-structure the situation depending on current constraints rather 
than on preconceived notions. Thus, individuals can achieve cognitive adaptation, 
even in complex and unforeseen situations: performance is enhanced, emotional 
balance is maintained. This is not owing to increased control, but owing to a
“letting go” movement, involving openness and acceptance of the “here and now.”
Our understanding of cognitive adaptation is enhanced by this new approach: 
training centered on cognitive control and training that seeks to foster a state of 
mind open to experience provide complementary solutions. 

Applications for military pilots 

To our knowledge, this new training approach has not yet been extensively studied 
in the context of military activities but it could prove valuable in enhancing
military pilots’ ability to manage complex and unforeseen situations (Fornette et 
al., 2012; Jha et al., 2010; Meland et al., 2012). 

MMMT is a form of cognitive technique. The operator is guided through 
attitudes that favor relevant information processing in three steps: (1) curiosity 
and acceptance at the information-input stage; (2) nuanciation and relativization 
at the information-processing stage; and (3) rationality and individualization
at the information-output stage. MMMT focuses on the situation that must be 
managed and on the way in which the operator approaches this situation. MMMT 
techniques are analytical, concrete, and active. Therefore, they are easy to grasp
for military personnel, who are used to dealing with such methods. Compared to MMMT,
MT goes deeper in fostering acceptance as an attitude, especially acceptance of 
emotions, and of physical sensations. MT techniques are, in essence, meditative 
and contemplative, and they tend to promote a “let-go” attitude. These techniques 
are more global, oriented toward feelings, and passive. They are less customary 
for the military. 

Given these characteristics, MMMT seems to be more readily acceptable 
among military pilots, compared to MT, which focuses on emotions, physical 
sensations, and “letting go.” MMMT may be thought of as an intermediate
approach between reinforcing control and letting go. One might design training 
that combines the advantages of both approaches. This might start by using MMM
techniques and then complement those with MT techniques. In any case, work is
needed to adapt these new training approaches to the population of pilots. Although
MMMT currently includes examples and aeronautical case studies, direct
professional applications should be added—for example, situational exercises or 
scenarios. A better adaptation of training to military personnel will likely provide 
a means to increase the use of learned techniques, beyond the training sessions— 
which may be essential to their efficacy (Jha et al., 2010). 


Conclusion 

Given the diversity and complexity of operational situations and systems, it 
is currently accepted in civil and military aviation that one cannot train for all 
situations. It is agreed, nowadays, that it is important to reinforce the preparation 
of pilots to manage complex and unforeseen situations. Organizations and 
individuals in charge of training operators engaged in risky activities must now 
deal with the following contradiction. On the one hand, one must design training 
programs that make it possible to teach operators to master risks by following 
procedures and routines that guarantee continued safe operation. On the other 
hand, one must develop adaptability, the ability to detect and identify unexpected 
situations, and then to process these situations by identifying new solutions while 
maintaining safe operation. 

As discussed in this chapter, we tested a new type of intervention, MMMT, and 
have obtained promising results. Other types of training, such as MT, also appear
to be particularly interesting for developing pilots’ ability to effectively manage
complex and unforeseen situations. These new training approaches should be the 
object of further studies in order to better understand how they operate, and to 
improve their adaptation to pilots. However, they already appear as a necessary 
complement to existing training, because they allow acquisition of skills that are 
needed for understanding and responding to complex and unforeseen situations.


Chapter 15

Distribution of Attention as a Function of 
Time: A Different Approach to Measure a 
Specific Aspect of Situation Awareness 

Katrin Meierfrankenfeld, Werner Gress & Bettina Vorbach 
German Air Force Center for Aerospace Medicine, Germany 


This chapter introduces a methodologically innovative approach to measuring the
distribution of attention. The basis for this work is the final stage of the German Air
Force aircrew selection procedure for fixed-wing aircraft. Applicants are tested during
a four-day simulator-based screening. Although this selection procedure shows excellent
results, it relies heavily on expert rating measures. Implementing an objective
measurement method could significantly enhance interobserver reliability, validity,
and test objectivity. The new approach could also support flying training:
during flight simulator training, observer ratings could be complemented by
objective data. The key to our approach is to use time as the unit for measuring 
distribution of attention. 


Situation Awareness and Distribution of Attention 

Situation awareness (SA) is a key concept in aviation psychology. Accidents 
and incidents are frequently explained by lack of SA (for example, Endsley & 
Garland, 2000; Jones & Endsley, 1996; Nullmeyer, Stella, Montijo & Harden, 
2005). Aircraft interface upgrades are justified by assumed increases in SA 
(Vidulich, 2003). Nevertheless, the underlying nature of SA is still debated. A 
generally accepted definition is proposed by Endsley (2000, p. 5): “[SA is] the 
perception of the elements in the environment within a volume of time and space, 
the comprehension of their meaning and the projection of their status in the near 
future.” Certain elements of the definition are still under discussion: regarding SA 
as product or process, as part of decision making, being somehow a meta-construct 
(for example, Carretta, Perry & Ree, 1996), or as being independent of decision-making 
processes (Endsley & Garland, 2000), as hierarchical structure (Endsley 
& Garland, 2000) versus “perceptual circle” (Adams, Tenney & Pew, 1995). 

There are also different approaches to measure SA. Three main approaches are 
(1) to infer SA or a lack of it from task performance (for example, stick movements, 
wrong control inputs, reaction times, and so on); (2) to freeze a simulated task and 
inquire which (important) details of the situation one recalls (for example, the last 
heading, altitude, and so on), also called memory probes; or (3) to use self or expert 
ratings where SA is graded based on observation of performance by experts (expert 
rating) or the pilot’s self-rating, which is prone to memory effects (if asked after the 
mission) and judgment errors. According to Endsley (2000), SA can be sub-divided 
into three hierarchical levels. The first level is to perceive (relevant) information 
in a situation. Level 2 SA is understanding the meaning of information; it can only 
work properly if based on correct and relevant information. In level 2 SA, pieces 
of information are interpreted, for example the perceptions of increasing speed 
and loss in altitude might be combined as “I am descending, hence I am getting 
faster.” Level 1 and 2 SA are mandatory to being able to predict what will happen 
in the future (level 3 SA); here, further descent might result in crashing into the 
ground. In summary, perception, comprehension, and projection as three levels of 
SA are the basis for proper decision making. Distribution of attention (DA) might 
be regarded as essential to the first level of SA, since attention is the gateway to 
perception. DA is akin to attentional flexibility or allocation of attention, which 
are often measured by visual scanning behavior (Bellenkes, Wickens & Kramer, 
1997). Salmon, Stanton, Walker, and Green (2006, p. 234) state: “[Measurement 
of eye movements is] recording the process that operators use in order to develop 
SA.” If you put it the other way around, tunnel vision is extreme loss of DA. 

Why would it be important to measure DA? Firstly, Jones and Endsley (1996) 
classified incidents recorded in the Aviation Safety Reporting System (1986-1992) 
by SA failures with the following results: 76.3 percent level 1 SA errors, 20.3 
percent level 2, and 3.4 percent level 3 SA errors. Failures of level 1 SA seem to 
be significant in incidents. Secondly, a lack of DA leads by definition to loss of SA: 
in a hierarchical structure, if you fail at level 1, levels 2 and 3 cannot be achieved. 
Even degraded SA performance might be adequate, as decisions made and actions 
taken might be the correct ones, by sheer luck or other factors. However, proper 
decisions seem to be more likely if SA is adequate. That is why DA is quite 
important in terms of perceiving necessary information for developing levels 2 
and 3 SA. This will increase the probability to take proper actions intentionally. 


Typical Measurement of Distribution of Attention (DA) 

Inasmuch as DA is essential for SA, measuring it is also essential. Self-ratings 
might be a dangerous way to assess DA and/or SA: one might lack SA because 
one does not acquire quite important information (failure at level 1 SA), and does 
not have a clue that something is missing. Experts’ ratings, on the other hand, infer 
SA and DA from performance, but performance (in a complex, non-experimental 
environment) is occasionally influenced by many factors, such as psychomotoric 
skills, flying experience, routine, decision making, speed of information processing, 
and so on. Further, observer errors in classifying performance might occur. 

One solution might be using eye trackers to record eye movements to allow 
evaluating such things as dwell times and saccades. This seems to be more 
objective as there is no need for an observer. But there are several disadvantages. 
As flying is a dynamic process, in some situations it is necessary to fixate on 
specific information, while the same behavior might be totally wrong in another 
phase of flight. For example, Bellenkes et al. (1997) found flexible modulated scan 
strategies by expert pilots, who allocated their attention according to maneuvers. 

A more severe problem is that “to see” does not necessarily mean “to perceive.” 
Not every piece of information available to the eye is encoded. Another approach 
to measure DA is multi-dimensional tracking (for example, altitude, speed, 
heading) based on deviation measures. It is performance based, therefore it might 
also be influenced by variables other than DA, but it seems to avoid judgment 
errors. However, the question of comparing units remains: for example, how to 
build a score of 10-degrees heading deviation, 5-knots speed deviation and/or 100-ft 
altitude deviation? 

In flying training and selection programs expert ratings are frequently used 
(for example, FAA flight test standards, Federal Aviation Administration, 2002). 
As typical observer mistakes can occur, objective data would help to increase 
objectiveness, reliability, and validity. The following section describes the 
development of an approach to measure DA objectively. A brief introduction into 
the German Armed Forces’ aircrew selection process and a description of the flight 
simulator used in this process will first be provided. 


Measuring Distribution of Attention (DA)—a Different Approach 

Aircrew selection in the German Armed Forces 

German Armed Forces’ aircrew selection procedure consists of three phases. 
Phases I and II include the officer selection and assessment of basic flight-specific 
aptitudes as well as an aviation-medical examination. Phase III for fixed wing 
aircraft¹ is more complex. It is a one-week simulator-based screening in a typical 
training scenario with the objectives of (1) testing whether applicants are qualified for 
any cockpit position; and (2) determining suitable future flying assignments (jet 
pilot, weapon system operator/navigator or transport pilot). For those purposes, 
applicants have to demonstrate their skills in academic training and in four flights 
in the simulator with increasing workload. As in real flying training, a briefing, a 
demonstration, and a practice phase, as well as subsequent debriefings, prepare 


1 Two Phase IIIs exist: one, as described here, for applicants for fixed wing aircraft, 
and another Phase III for helicopter pilot applicants. Both Phase IIIs are quite similar except 
for differences reflecting specific demands in rotary versus fixed wing flying, operational 
flying and, of course, cockpit layouts in use. This chapter refers to Phase III for fixed-wing 
aircraft only. 
applicants for their check phases. The overall aim is to evaluate specific capabilities 
in a complex scenario similar to real flying training. 

Another aim of Phase III is to minimize attrition rates during basic flight 
training. Long-term evaluation has proven the aircrew selection process to be 
adequate. Attrition rates during flying training are low (for example, in ENJJPT² 
from 2007 to 2012: 5.4 percent total and 3.8 percent due to flying deficiencies; 
compared to up to 35 percent in previous, outdated selection procedures). 
Approximately 200 applicants are tested in Phase III annually. Looking at the 
applicant’s typical profile in Phase III, he or she is just about to graduate from 
college with an average age at about 19 to 20 years with no flight experience. Few 
applicants are active duty soldiers. 

The simulator-based screening takes one week; six applicants are tested per 
week. The entry-level is the pre-selected “pedestrian,” who ought to be able to fly 
missions including tactical elements rather quickly. So the pace of progress has to 
be fast, and thorough preparation and knowledge are required (a text book is provided 
to the applicants for study three to four weeks prior to the selection phase). 
Sequence of missions: Mission 1 is a familiarization with the simulator, static and 
dynamic; for example, cockpit layout, taxiing, takeoff, acceleration, climb, and 
level turns are practiced. Mission 2 consists of traffic pattern procedures (from 
taxiing to full stop landing), evaluating mainly procedural skills. Missions 3 and 4 
include tactical elements requiring more flexibility to demonstrate 
information management and decision making. Dynamic maneuvering, including 
recoveries from unusual attitudes, trail formation, and intercept and attack of a 
simulated hostile aircraft are elements in Mission 3. In Mission 4, a low-level 
navigation route with additional tactical tasks that are unexpectedly called up 
during the mission task-saturates the applicants. Both missions (3 and 4) are highly 
dynamic, requiring proper reactions. Each of the above-mentioned missions has 
specific requirements as well as common parts that remain the same from Mission 
1 to Mission 4 (for example, takeoff). Learning and applying procedures are 
essential. The completion of the selection phase is marked by a Selection Board 
consisting of an Aviation Psychologist, an experienced military Jet Instructor Pilot/ 
Navigator and the Military Training Staff Officer to make the final decision on 
passing or failing the selection process and to determine best suitable future flying 
assignments. The pass/fail decision as well as grading (seven-scaled, 1 = excellent 
and 7 = unsatisfactory, and five-scaled, A = excellent, B = good, C = average, D = 
marginal and U = unsatisfactory) and proposals on cockpit assignments are based 
on aptitudes, performance, and progress during the week. Among the applicants 
who successfully completed Phase III, only the very best will be selected by the 
Human Resource Department to join a specific flight training track according to 
hiring needs. 


2 Euro-NATO Joint Jet Pilot Training (ENJJPT) is a multinational, very demanding 
training program for future fighter pilots. 
Flight simulator used in this study: The Aviation Psychological Pilot Selection 
System/Fixed Wing (FPS/F) 

Figure 15.1 The flight simulator used in Phase III/fixed wing (FPS/F) 
consists of cockpits with a high-quality screen (15.1a), and the 
multifunctional display showing expanded instrumentation as 
well as touchscreen and radio panel (15.1b) 

The FPS/F (Aviation Psychological Pilot Selection System/Fixed Wing) is a flight 
simulator consisting of four cockpits with lockable canopies, a spherical five- 
channel high-resolution outside projection system with 200 degrees horizontal and 
45 degrees vertical field of view, which matches the capabilities of the human eye. 
This simulator is no training device—the FPS/F uses a simple generic flight model: 
the cockpits are equipped with basic flight controls (with an optional side stick), a 
gear and flaps button, and a multifunctional touchscreen display (MFD) offering 
basic flight instrumentation, radio control, and various additional sections, which 
allow for checking engine, fuel, or electric system, displaying radar or mission 
instructions (on programmable text pages; see Figure 15.1). In Mission 3 a pipper 
is displayed for aiming purposes. In Mission 4 a head-up display (HUD), radar 
altitude, and indicated air speed are introduced. The instructor consoles enable 
mission control as well as monitoring of the applicant’s activities and performance. 
Missions are debriefed without delay providing replay features at the debriefing 
station. Further data can be analyzed at an evaluation station. 

The aircraft simulated is a generic prop aircraft, single-seat, single engine 
with retractable landing gear. An automatic trim feature is implemented to ease 
aircraft control. So as long as a certain flight attitude is set and no flight control 
inputs are made, the aircraft tries to maintain this attitude; the system should not be 
compared with a standard autopilot. The intention is to keep the simulator simple, 
to enable the applicants to fly the simulator in missions that require handling 
standards above basic flying capabilities rather quickly (see Figure 15.2). To 
reduce complexity, there is no torque-effect, no trim necessary, and weather is 
always fine. However, it is worth noting that the FPS/F capabilities have 
growth potential to evaluate specific flying aptitudes in a more sophisticated 
environment, as deemed necessary for the intended mission objective. 

For screening purposes, standardized missions are used to facilitate maximum 
standardization even though the simulator reacts adaptively regarding the 
applicant’s behavior. A LUA³-based mission editor can be used to design new 
tasks, missions, and evaluation matrixes to expand the system’s capabilities. 

Construction of the measurement 

Construction of a DA measurement requires several steps. A task analysis for 
maneuvers and sub-maneuvers has to be conducted. It is necessary to differentiate 
between the (sub-) maneuvers and to detect transition phases between those (sub-) 
maneuvers (it would not be very helpful to apply the same DA algorithm on the 
entire maneuver as requirements for DA will change during the maneuver). The 
second step is to define sets of parameters necessary for a proper DA for each 
(sub-)maneuver: as flying is a dynamic process, the parameters required for DA 
vary with the requirements of the tasks (see Previc et al., 2009). Because it is 
usually not possible to maintain precise values set in the parameters (except 


3 LUA is a programming language. 
Figure 15.2 Sequence of events. Phase III portrays fairly realistically the everyday routine of 
a student pilot in a complex flying training environment: a 
combination of academics (including tests) and flying training 
(demonstration phase, practice, evaluation, and debriefing) 

for digital parameters such as gear status up/down), it is more useful to define 
acceptable ranges around each parameter (note: the magnitude of these ranges also 
defines the difficulty of the task). 

Finally, the algorithm for calculating this time-based DA score can be applied. 
It measures the time increments where DA fails. In this model, DA fails if one or 
more parameters exceed the defined ranges and the parameters are not corrected. 

Defining the score: Some examples 

Sub-maneuvers, sets of necessary parameters, acceptable ranges, and corresponding 
proposals on grades are defined in behaviorally-based rating scales (BARS). BARS 
are used by the experts to rate the respective flying maneuvers in Phase III. Sub-maneuvers 
and sets of parameters as described in BARS were reviewed by experts 
who supplemented or eliminated parameters. In addition, a literature review 
supported the chosen parameters. Table 15.1 shows some examples for parameters 
used in different flight phases. It is not possible to describe all deviations and 
interactions for each and every maneuver. In the following paragraph only a few 
examples are shown. Configuration changes in level flight result in lift changes; this 
has to be compensated for by changes in angle of attack. During acceleration and 
deceleration in level flight, the angle of attack has to be adjusted to compensate for 
lift gain or loss. Speed has to be monitored and power settings adjusted to reach 
or maintain the desired speed. Heading has to be monitored although no changes 
are expected, but some unintended control inputs could have caused deviations. 
During turns, the need to check parameters increases. Altitude and/or vertical 
speed should be held constant; therefore, power and angle of attack are increased 
as angle of bank increases. Both have to be stabilized in turn; by rolling out, angle 
of attack has to be decreased for straight-and-level flight attitude. Thus, altitude 
and vertical speed, angle of bank and their interactions have to be monitored or 
controlled. Heading change has to be monitored to roll out at the desired heading. 
Again, if deviations of one parameter occur, they should not be corrected without 
regard to the other parameters. If corrections are made, one has to anticipate 
reaching the desired parameters. For example, one should stop turning once the 
desired heading has been reached, and stop climbing or descending once the 
desired altitude has been reached; this again requires anticipation. Furthermore, 
basic aerodynamic principles should be taken into consideration: for example, if 
one climbs accidentally, then one would lose speed if the power setting is not 
adjusted. 

Table 15.1 Sets of criteria (examples) and acceptable deviations around 
them as basis for the DA-score 

Maneuver       Set of Variables               Range 
Level Flight   Altitude 1,000 ft              +/- 20 ft 
               Indicated Air Speed 130 kts    +/- 3 kts 
               Required HDG                   +/- 1° 
Level Turn     Altitude 1,000 ft              +/- 20 ft 
               Indicated Air Speed 130 kts    +/- 3 kts 
               AOB 30°                        +/- 2° 
               R/O: Required HDG              +/- 1° 

Note: HDG = Heading. AOB = Angle of Bank. R/O = Roll Out. 
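
The criteria in Table 15.1 can be read as target values with symmetric tolerances. As a minimal sketch only (Python is used purely for illustration; the names Criterion, level_flight, level_turn, and violations are hypothetical and are not taken from the FPS/F software), a range check for a single set of instrument readings might look like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One row of Table 15.1: a target value and its acceptable deviation."""
    name: str
    target: float
    tolerance: float

    @property
    def lower(self) -> float:
        return self.target - self.tolerance

    @property
    def upper(self) -> float:
        return self.target + self.tolerance


def level_flight(required_hdg: float) -> list:
    # Values from Table 15.1; the required heading depends on the task.
    return [
        Criterion("altitude_ft", 1000.0, 20.0),
        Criterion("ias_kts", 130.0, 3.0),
        Criterion("heading_deg", required_hdg, 1.0),
    ]


def level_turn() -> list:
    # The roll-out heading criterion applies only at roll-out and is
    # left out of this simplified sketch.
    return [
        Criterion("altitude_ft", 1000.0, 20.0),
        Criterion("ias_kts", 130.0, 3.0),
        Criterion("aob_deg", 30.0, 2.0),
    ]


def violations(criteria, readings) -> list:
    """Names of parameters whose reading lies outside the acceptable range.

    Assumption: plain numeric comparison; the 360-degree wrap-around of
    heading is not handled here.
    """
    return [c.name for c in criteria
            if not (c.lower <= readings[c.name] <= c.upper)]
```

For example, with a required heading of 90°, readings of 1,015 ft, 125 kts, and 90.5° would flag only the air speed, because 125 kts lies below the lower limit of 127 kts.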


Takeoff and final approach have specific peculiarities. Both are quite dynamic 
as speed and configuration of the aircraft are changing. Furthermore, gear and 
flaps buttons have to be used, which calls for attention and sometimes results in 
unintended control inputs. 

Level flight seems to be a quite undemanding maneuver; nevertheless, there 
is room for errors. This maneuver is described in more detail below to illustrate 
possible deviations in a generally considered simple maneuver and to explain how 
the score is working. In level flight, altitude, vertical speed, speed, and heading 
should be checked to maintain preset values; deviations must be corrected. 
Typically, attention diverts and fixates on landmarks, in-flight checks, or radio 
transmissions. Consequently, the aircraft deviates unnoticed from desired flight 
path. 


• All desired parameters are within their ranges. No failure of DA would 
be recorded. 

• Speed is low because of a wrong power setting. Neither is detected. Failure 
of DA is recorded as soon as speed is beyond the defined value until the 
correction is started. 

• Altitude increases or decreases, thus speed is decreasing or increasing 
respectively. Failure of DA is recorded as long as the defined ranges in 
altitude and/ or speed are violated, until correction starts. Correction of 
altitude without correction of speed via power settings is sufficient, as 
speed deviation is caused by altitude deviation. 

• Altitude has increased, speed is therefore low. Speed is corrected by power 
setting, altitude is not corrected: failure of DA is recorded because the 
applicant fixates on speed and ignores altitude. 

• Altitude has increased, speed is therefore low. Speed is corrected by power 
setting, without regard to altitude. With delay, altitude is corrected. 
Subsequently, speed is corrected again by power setting. DA would be 
considered to fail up to the beginning of the second correction of speed 
(as far as deviations are out of limits), because previously, at least one 
parameter is out of limits and not corrected. Correcting altitude first 
would lead to less time failing. 

• Minor pilot-induced oscillation (PIO) that could be due to inadequate 
psychomotor skills is tolerated. Failure of DA is recorded as soon as altitude, 
speed, or heading deviations exceed the acceptable ranges until there is 
a corrective action in the correct direction. Similar problems exist with 
multitasking: because corrections or changes in one axis of control could 
lead to changes in another axis—if not carried out swiftly—the applicant’s 
workload could increase significantly causing violation of limitations. 

In summary, deviations caused by other factors than DA should generally not 
affect the algorithm. On the other hand, the algorithm cannot differentiate between 
causes for attention fixations: is it “pure” DA, is there a lack of concentration, is 
there an interaction between psychomotor skills or multitasking deficiencies and 
DA? A certain degree of interdependency is inherent in the concept of DA. 

Defining distribution of attention: Mathematical description 
The algorithm used basically computes the time within a maneuver where (1) at 
least one of the relevant parameters (Para) exceeds the limits of the acceptable 
range defined by the lower and upper limits (LL and UL); and (2) the distance of 
this parameter from the acceptable range (ΔPara) is increasing (that is, the pilot is 
not correcting). Thus, the unit for this (raw) score is simply seconds. A further benefit 
of this approach is that it avoids the usual problem with units in multi-dimensional 
tracking experiments. In mathematical terms the simplified definition of the DA-measure 
looks like this: 

$$
DA(t) = \sum_{i} \Delta t_i \;\Big|\; \Big\{ \big[ (Para_{1,i} < LL_1) \lor (Para_{1,i} > UL_1) \big] \land \big[ \Delta Para_{1,i} > \Delta Para_{1,i-1} \big] \Big\} \lor \dots \lor \Big\{ \big[ (Para_{m,i} < LL_m) \lor (Para_{m,i} > UL_m) \big] \land \big[ \Delta Para_{m,i} > \Delta Para_{m,i-1} \big] \Big\}
$$

$$
\forall\, Para_j, LL_j, UL_j \in \{Para_1, \dots, Para_m\},\ \{LL_1, \dots, LL_m\},\ \{UL_1, \dots, UL_m\}
$$

Where: 

• $\{Para_1, \dots, Para_m\}$ is a set of parameters necessary for a proper distribution 
of attention during a specific maneuver. $\{LL_1, \dots, LL_m\}$ and $\{UL_1, \dots, UL_m\}$ are 
the corresponding lower and upper limits defining the acceptable range for 
$Para_j$. The number of relevant parameters is defined by $m$. 

• $\Delta t_i = t_i - t_{i-1}$ is the duration of a time frame from the previous to the 
present point of time. $t_0$ defines the start of the maneuver, $t_n$ the end of 
the maneuver. 

• $Para_{j,i}$ represents the value of $Para_j$ at time $i$. 

• $\Delta Para_{j,i} := \min\left( |UL_j - Para_{j,i}|,\ |LL_j - Para_{j,i}| \right)$ is the distance of the 
parameter $Para_j$ from the acceptable range for $Para_j$ defined by $UL_j$ and 
$LL_j$ at time $i$. 

• $\Delta Para_{j,i-1} := \min\left( |UL_j - Para_{j,i-1}|,\ |LL_j - Para_{j,i-1}| \right)$ is the distance of 
the parameter $Para_j$ from the acceptable range for $Para_j$ defined by $UL_j$ 
and $LL_j$ at (previous) time $i-1$. If $\Delta Para_{j,i}$ is smaller than $\Delta Para_{j,i-1}$, then the 
values of $Para_j$ are approaching the acceptable range. 

Basically the algorithm starts counting time, if at least one relevant parameter 
exceeds the acceptable range and is not approaching the acceptable range.⁴,⁵ A 
proportion of time can easily be calculated by dividing the raw score by the total 
duration of the maneuver. 
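
To make the definition concrete, the following is a minimal sketch of how such a time-based score could be computed from sampled simulator data. It is an illustration under stated assumptions, not the implementation used in the FPS/F: the function names (distance_outside, da_raw_score, da_proportion) and the data layout are hypothetical, the distance to the acceptable range is taken to be zero while a parameter is inside its range, and angular wrap-around (for example, heading) is not handled.

```python
def distance_outside(value, lower, upper):
    """Distance of a value from the acceptable range [lower, upper].

    Assumption: the distance is 0.0 while the value is inside the range.
    """
    if value < lower:
        return lower - value
    if value > upper:
        return value - upper
    return 0.0


def da_raw_score(times, samples, limits):
    """Seconds during which DA is counted as failing.

    times   -- increasing timestamps in seconds (t_0 ... t_n)
    samples -- one dict per timestamp, mapping parameter name to value
    limits  -- dict mapping parameter name to (lower limit, upper limit)

    A time frame is counted if at least one parameter is outside its range
    and its distance from the range has grown since the previous sample.
    The strict ">" follows the formula; reading "not approaching" as ">="
    would also count a constant deviation.
    """
    failed = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        for name, (lower, upper) in limits.items():
            now = distance_outside(samples[i][name], lower, upper)
            before = distance_outside(samples[i - 1][name], lower, upper)
            if now > 0.0 and now > before:
                failed += dt
                break  # count each time frame only once
    return failed


def da_proportion(times, samples, limits):
    """Raw score divided by the total duration of the maneuver."""
    return da_raw_score(times, samples, limits) / (times[-1] - times[0])


# Synthetic level-flight trace, sampled once per second (invented data).
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
altitude = [1000.0, 1010.0, 1025.0, 1040.0, 1030.0, 1015.0]
speed = [130.0, 129.0, 126.0, 125.0, 125.0, 128.0]
samples = [{"altitude_ft": a, "ias_kts": s} for a, s in zip(altitude, speed)]
limits = {"altitude_ft": (980.0, 1020.0), "ias_kts": (127.0, 133.0)}

raw = da_raw_score(times, samples, limits)
share = da_proportion(times, samples, limits)
```

In this invented trace the applicant climbs out of the altitude range while speed decays; the two frames in which a deviation is still growing are counted, and counting stops once the altitude correction begins, even though the values are still out of limits.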


Empirical Findings 

Study 1 — Testing the new score in Mission 2 (traffic pattern mission) 

Hypothesis 

An objective time-based score for DA should correlate with the experts’ ratings 
of DA, with performance in Mission 2, and with overall success in Phase III. The 
last mentioned correlation is expected to be only small or medium, as Mission 2 


4 The distances $\Delta Para_{j,i}$ are only meaningfully defined if the value of $Para_{j,i}$ exceeds 
the acceptable range. Before actually calculating differences/distances/relational operations 
for angular parameters such as heading, bank, or pitch, the overflow (for example, 360 
degrees for heading) has to be caught. 

5 This definition can be easily extended as to the degree of failure of DA. For example, 
a situation when two or more parameters are deviating from the acceptable range usually is 
a more severe failure of DA than a situation when DA fails for only one parameter. 
is only one of four missions and additional tests in academics are used to assess if 
one is qualified to become a German Air Force aircrew member. 

Expert ratings in Phase III 

Experts grade each maneuver, each pattern, overall performance in Mission 2, as 
well as DA in Mission 2 on seven-point scales (1 = excellent and 7 = unsatisfactory). 
Other aptitudes assessed during Phase III are not reported here. Success in Phase 
III is based on performance and progress from Mission 1 to Mission 4 according 
to expert ratings. It also ranges from 1 to 7 (1 = excellent and 7 = unsatisfactory). 
Grades 6 and 7 indicate failure in Phase III and no qualification for any cockpit 
position. For the Human Resource Department administering the applicants, the 
overall results are clustered in grades (A = excellent, B = good, C = average, 
D = marginal and U = unsatisfactory) corresponding to the seven-point scale on 
success from Mission 1 to Mission 4 and results in academics. 

Results 

Data from all applicants who participated in Phase III from January to April 
2012 were used. From 52 applicants, results were 1 “A/excellent,” 4 “B/Good,” 
11 “C/average,” 8 “D/marginal,” and 28 “U/unsatisfactory.” There was one 
female applicant. Applicants were young adults, aged from 18 to 24 (M = 20, SD 
= 1.96). Eighty-three percent had a qualification for university entrance, seven 
had a secondary school-level certificate (13 percent), and two (4 percent) had an 
advanced technical college entrance qualification. 

DA graded by experts ranged from 2 to 7 (M = 4.8, SD = 1.42). Performance 
in Mission 2 ranged from 1 to 7, with a mean of 4.9 (SD = 1.57). Average grade 
in pattern 1 was 4.8 (SD = 1.31), 4.7 in pattern 2 (SD = 1.32) and 5.1 in pattern 
3 (SD = 1.6). A composite score consisting of DA-scores (as a proportion of time 
with value 0 indicating no failure in DA during the sub-maneuver and 1 indicating 
failing permanently during the sub-maneuver) in Mission 2 was calculated. 
This score for DA in Mission 2 could theoretically range between 0 (all relevant 
parameters within the acceptable ranges or correcting toward the desired values) 
and 57 (always exceeding the acceptable ranges and not correcting). In this sample, 
the score ranged from 9.85 to 31.56 (M = 20.6, SD = 4.59), with lower scores 
indicating a better DA. Further, mean pattern-wise scores were computed and are 
7.60 for pattern 1 (SD = 2.16), 7.30 for pattern 2 (SD = 1.77), and 7.44 for pattern 
3 (SD = 2.06). Composite maneuver-wise scores were computed, too. Correlations 
between scores as well as experts’ ratings are shown in Table 15.2. The computed 
score and the experts’ ratings have a significant and large (Cohen, 1988) correlation of .70, with 
α < .01 and 49 percent shared variance. The smaller the score 
(that is, the better the DA), the better the experts’ rating of DA. Furthermore, the 
computed composite score and success in Phase III correlate .53 (α < .01), that is 
28.1 percent shared variance. As expected, a low composite score (meaning good 
DA) and good grades (indicating good performance in Phase III) are associated. 
The experts’ grades for DA correlate .72 with success in Phase III (α < .01), that is 
51.8 percent shared variance. 
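
The shared-variance figures quoted above are simply the squared correlation coefficients expressed as percentages. As a small worked check (the function name shared_variance_percent is only illustrative):

```python
def shared_variance_percent(r: float) -> float:
    """Shared variance of two measures: r squared, as a percentage."""
    return round(100.0 * r * r, 1)


# Correlations reported in Study 1.
score_vs_expert_da = shared_variance_percent(0.70)   # computed score vs. experts' DA rating
score_vs_success = shared_variance_percent(0.53)     # computed score vs. success in Phase III
expert_da_vs_success = shared_variance_percent(0.72) # experts' DA rating vs. success in Phase III
```

This reproduces the reported 49, 28.1, and 51.8 percent.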

Correlations between experts’ ratings of DA and pattern-wise and maneuver-wise 
composite scores were medium to high and overall significant (α < .05; see 
Table 15.2). Among those correlations, the correlations of the composite score 
of pattern 2 and of the composite scores consisting of legs in pattern 3 with the 
experts’ ratings are particularly high (> .60, α < .01). 

Study 2—Taking a closer look at the final approaches of Mission 2 

In this study, we are taking a closer look at a specific maneuver in Mission 2 
(traffic pattern), namely at final approach. DA is paramount on final approach. 
Before rolling out on final, landing configuration (gear and flaps down) has to be 
established, which requires multitasking and psychomotor skills. During finals, 
applicants “just” have to monitor their parameters. They should have established 
the required rate of descent, being on glide path with the appropriate speed, and 
heading toward the centerline. During this process some checks and/or radio 
transmissions have to be accomplished. In the first attempts in Study 1, finals were 
defined via vertical velocity, indicated air speed, and heading, not taking into 
account some important interactions during final. Correct final speed is achieved 
if power setting and pre-calculated rate of descent are correct. But deviations from 
glide path require immediate adjustment of the descent rate, which usually leads 
to changes in speed, too. Thus, besides taking a look at descent rate and speed, it 
is also important to consider the interaction between descent rate and glide path. 
Hence, the DA score used for finals in this study is based on air speed, glide path, 
vertical velocity, and heading. 

Hypothesis 

Correlations between the computed score and DA as graded by experts in Mission 
2, throughout Phase III, and success in Phase III were calculated. It was expected 
that medium to high correlations would exist, as finals are only one part of Mission 
2, whereas the experts’ ratings are based on Mission 2 (three pattern flights with 
full stop landing). The experts’ ratings of success in Phase III and DA in Phase III 
are even based on four missions. 

Results 

A sub-sample of 27 applicants participating in Phase III in July 2013 was used, 
with the following results: two “B” (7.4 percent), five “C” (18.5 percent), four “D” 
(14.8 percent), and 16 “U”/unsatisfactory (59.3 percent). There was one female 
applicant. Mean age was 19.8 years, ranging from 18 to 22 years (SD = 1.18). Most 
applicants had a qualification for university entrance (92.6 percent). Study 1 
suggested that pattern 1 seemed to be less important for building scores of DA. 
Therefore only patterns 2 and 3 were taken into consideration. In this sample, DA 
as graded by experts ranged from 3 to 7 (M = 5.3, SD = 1.23). Performance in 
Mission 2 ranged from 2 to 7, with a mean of 5.4 (SD = 1.27). Average grade in 
pattern 1 was 4.9 (SD = 1.17), 5.0 in pattern 2 (SD = 1.31) and 5.5 in pattern 3 
(SD = 1.31). The correlation between the computed score and average DA in Phase III 
(from Mission 1 to Mission 4) was high (.67, p < .01; r² = 44.9 percent), as were 
the correlations with success in Phase III (.53, p < .01; r² = 28.1 percent) and 
with DA in Mission 2 (.55, p < .01; r² = 30.3 percent). It is remarkable that one 
small part of Mission 2 correlates that highly with the overall results, especially 
in a small sample. 

Table 15.2 Correlations between computed scores of DA, the experts’ ratings, and success in Phase III 
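
The shared-variance percentages reported with each correlation are simply the squared correlation coefficient. The following minimal sketch shows that arithmetic together with a Pearson correlation and the usual t statistic for testing it. The function names are illustrative, and the use of n = 27 for every correlation is an assumption based on the sub-sample size given above.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def shared_variance_percent(r):
    """Shared variance (coefficient of determination) in percent."""
    return 100.0 * r * r


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


if __name__ == "__main__":
    # Reported correlations; n = 27 is assumed for each of them.
    for r in (0.67, 0.53, 0.55):
        print(r, shared_variance_percent(r), t_statistic(r, 27))
```

Because the reported coefficients are rounded to two decimals, the recomputed percentages can differ from the published ones in the last digit. With 25 degrees of freedom the two-tailed critical t value at the .01 level is about 2.79, and the smallest reported correlation (.53) gives a t value of roughly 3.1, which is consistent with the significance levels stated in the text.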


Conclusion and Discussion 

First results concerning the development of a time-based measurement of DA are 
reported. Correlations between the computed scores and the experts’ ratings were, 
as expected, large and significant. Computed scores in Mission 2 even correlated 
highly with success in Phase III, which comprises four missions. Although DA 
is a quite important construct for results in Phase III, it is surprising that scores 
based on just one part of one mission correlate so strongly with results in Phase 
III (for example, with almost 30 percent shared variance). Consequently, it seems 
to be a promising approach to use a time-based score for assessing DA. Yet, 
some questions remain. First, there are some conceptual considerations. Flying 
is generally dynamic, even in “easy” maneuvers like level flight or level turns 
in Mission 2. Observing basic aerodynamic laws as well as using the T-Scan 
type crosscheck are helpful for an adequate DA, as calculated by the algorithm. 
Thus, it could be argued that the algorithm does not only calculate level 1 SA, the 
“perception of elements in the environment within a volume of time and space” 
(Endsley, 2000, p. 5), but actually measures level 2 SA: “the comprehension of 
their meaning.” The term perception of elements does not imply that every element 
within a period of time will be, or even could be, perceived. For example, van Dijk, 
van de Merwe & Zon (2011, p. 344) state, “the pilot does not need to know 
everything” (for example, “the co-pilot’s shoe size”). 
They summarize what is important to a pilot: “pilots must be aware of critical flight 
parameters, the state of their onboard systems, their own location and the location 
of important reference points and terrain, and the location of other aircraft. This 
information forms the elements they [the pilots] need to perceive to have good level 
1 SA” (van Dijk et al., 2011, p. 344). Thus, there might be an inherent assumption 
in Endsley’s definition concerning the perception of only some, namely the important, 
flight parameters. Pew (2001, p. 35) also talks about awareness of the “current 
state of the system (including all the relevant variables)”—thus also assuming 
some judgments concerning the relevance of information. Klein (2001, p. 52) has 
a somewhat different point of view; he states that level 1 SA requires level 2 SA to 
“determine which objects are relevant.” He emphasizes the importance of context 
(p. 52): “what counts as an error is a function of the task being performed,” giving 
the example of a 100-ft deviation in altitude flying at 32,000 ft or preparing to 
land on an aircraft carrier (where 100-ft deviation might be fatal). Thus he claims 
that measures of level 1 SA should not only measure SA as the sum of the elements 
that are correctly recorded, because the context is quite important. In conclusion, 
according to Klein (2001) the reported score represents a mixture or interaction 
between level 1 and level 2 SA. Besides, the score fulfills Klein’s request to take 
the context into account by defining necessary parameters and acceptable ranges 
around them for each sub-maneuver. 

In Study 2, final approaches were presented as an “ideal” maneuver to measure 
DA. Are similar maneuvers more or less appropriate to measure DA? This could 
be tested by computing factor structure of maneuver-oriented scores as well as 
correlations of scores with other constructs like concentration or multitasking. 
Can “critical flight phases” for DA and other aptitudes be identified? This will 
have consequences for DA-scores. The score might consist of all maneuvers of 
a mission (as reported in Study 1), of all maneuvers with a form of weighting, 
or of only some maneuvers. Future studies will have to be based on much larger 
samples to try to answer these questions. As described, methods more sophisticated 
than simple correlations should be used. Furthermore, the relationship between 
expert ratings and computed scores should be examined in more detail. The aim is 
to combine the advantages of both measurement methods, the one being a rather 
objective, the other a rather subjective measure. It is therefore important to 
answer the following questions. What is the relevance of the computed score: does 
it explain variance in applicants’ performance beyond observers’ ratings, that is, 
is there a gain in incremental validity? There is a higher correlation between 
success in Phase III and experts’ ratings of DA than between success in Phase III 
and the computed score (see Table 15.2). Can this result be confirmed? Is this 
finding due to implicit weighting done by the experts while there is no weighting 
in the computed score yet? Is it due to method effects? Observers’ scales rely 
on the observation method, and success in Phase III in particular is based on 
results in all missions, so there is of course some confounding. Is observation 
biased by typical observation errors? What about predictive validity with regard to 
later training results? It would also be interesting to use the algorithm in 
differently configured simulators. These questions open a wide field for further 
research, and they must be answered in order to establish and validate the new 
measurement method. 
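
The question of incremental validity can be made concrete with a minimal sketch. For two predictors, the squared multiple correlation follows from the three pairwise correlations, and the gain is the difference between that value and the squared correlation of the first predictor alone. The numbers in the example are hypothetical placeholders, not results from this study, and the function names are illustrative.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation of a criterion with two predictors.

    r_y1, r_y2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    if abs(r_12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    numerator = r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12
    return numerator / (1.0 - r_12 ** 2)


def incremental_validity(r_y1, r_y2, r_12):
    """Gain in explained variance when predictor 2 is added to predictor 1."""
    return r_squared_two_predictors(r_y1, r_y2, r_12) - r_y1 ** 2


if __name__ == "__main__":
    # Hypothetical values: predictor 1 = experts' rating of DA,
    # predictor 2 = computed score, criterion = a later training result.
    gain = incremental_validity(r_y1=0.60, r_y2=0.50, r_12=0.55)
    print(f"Hypothetical gain in explained variance: {gain:.3f}")
```

The more strongly the computed score and the experts' ratings correlate with each other, the smaller the gain becomes. A criterion that is independent of the observers' ratings, such as later training results, would be needed to avoid the confounding mentioned above.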

In summary, measuring DA by defining the flight path (including the definition of 
parameters and acceptable deviations) in a test simulator and using this information 
to produce a digital readout of actual performance within DA is a step toward 
extending the consolidated knowledge about one of the key concepts in aviation 
psychology: SA. 